OBJECTIVE: To evaluate and compare the measurement precision and sensitivity to change of the Health Assessment Questionnaire disability index (HAQ DI), the Short Form 36 physical functioning scale (PF-10), and simulated Patient-Reported Outcomes Measurement Information System (PROMIS) physical function computer adaptive tests (CATs) with 5, 10, and 15 items, using item response theory-based simulation studies. METHODS: The measurement precision of each physical function instrument was evaluated by calculating the root mean square error (RMSE) between true physical function levels (latent physical function scores) and estimated physical function levels. Measurement precision was evaluated at 9 levels of physical function, with 5,000 simulated response patterns per level. Sensitivity to change was evaluated as the ability of a simple statistical test to detect simulated change scores of small to moderate magnitude (standardized effect sizes of 0.20, 0.35, and 0.50). RESULTS: RMSEs were smaller for the PROMIS physical function 15-item CAT (CAT-15) and 10-item CAT (CAT-10) than for the HAQ DI and PF-10 across all levels of the latent physical function scale. The CAT-15 showed only marginal improvement over the CAT-10, and the 5-item CAT (CAT-5) performed similarly to the HAQ DI and PF-10 across most levels of the latent physical function scale. Sensitivity to change was substantially better for the CAT-10 than for the HAQ DI and PF-10, particularly in detecting moderate effect sizes. CONCLUSION: Measurement precision was clearly higher for the PROMIS physical function CATs with 10 or 15 items than for the HAQ DI and PF-10. Higher reliability also translated into lower sample size requirements for detecting changes in clinical status.
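The RMSE procedure described in METHODS can be illustrated with a minimal simulation sketch. It is not the study's implementation: it assumes a simplified dichotomous two-parameter logistic model, expected a posteriori (EAP) scoring with a standard normal prior, hypothetical item parameters (`make_items`), and 9 equally spaced latent levels from -2 to 2. None of these choices is stated in the abstract. Only the 9 levels and the 5,000 simulated response patterns per level come from the text.

```python
import numpy as np

def make_items(n_items):
    """Hypothetical item bank: equal discriminations, evenly spread difficulties."""
    a = np.full(n_items, 1.5)            # discrimination (assumed value)
    b = np.linspace(-2.0, 2.0, n_items)  # difficulty (assumed range)
    return a, b

def simulate_responses(theta, a, b, rng, n):
    """Draw n dichotomous response patterns at one true latent level (2PL)."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return (rng.random((n, a.size)) < p).astype(float)

def eap_estimates(responses, a, b, grid):
    """EAP estimate of the latent level for each response pattern."""
    p = 1.0 / (1.0 + np.exp(-a[None, :] * (grid[:, None] - b[None, :])))
    p = np.clip(p, 1e-12, 1.0 - 1e-12)   # guard the logarithms
    loglik = responses @ np.log(p).T + (1.0 - responses) @ np.log(1.0 - p).T
    logpost = loglik - 0.5 * grid[None, :] ** 2   # standard normal prior
    logpost -= logpost.max(axis=1, keepdims=True)
    post = np.exp(logpost)
    post /= post.sum(axis=1, keepdims=True)
    return post @ grid

def rmse_by_level(a, b, levels, n_per_level=5000, seed=0):
    """RMSE between true and estimated latent levels, per true level."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(-4.0, 4.0, 81)    # quadrature grid for EAP scoring
    result = {}
    for theta in levels:
        resp = simulate_responses(theta, a, b, rng, n_per_level)
        est = eap_estimates(resp, a, b, grid)
        result[float(theta)] = float(np.sqrt(np.mean((est - theta) ** 2)))
    return result

if __name__ == "__main__":
    levels = np.linspace(-2.0, 2.0, 9)   # 9 latent levels (spacing assumed)
    for n_items in (10, 20):
        a, b = make_items(n_items)
        rmse = rmse_by_level(a, b, levels)
        print(n_items, {k: round(v, 3) for k, v in rmse.items()})
```

Under these assumptions, comparing instruments amounts to running `rmse_by_level` with each instrument's item parameters and comparing the resulting RMSE values level by level. Simulating an adaptive test would additionally require an item selection rule, which this sketch omits.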