Item response theory (IRT) item parameters can be estimated using data from a common item equating design either separately for each form or concurrently across forms. This paper reports the results of a simulation study of separate versus concurrent item parameter estimation. Using simulated data from a test with 60 dichotomous items, 4 factors were considered: (1) program (MULTILOG versus BILOG-MG); (2) sample size per form (3,000 versus 1,000); (3) number of common items (20 versus 10); and (4) equivalent versus nonequivalent groups taking the 2 forms (no mean difference versus a mean difference of 1 standard deviation). In addition, four methods of item parameter scaling were used in the separate estimation condition: two item characteristic curve methods (Stocking-Lord and Haebara) and two moment methods (Mean/Mean and Mean/Sigma). Although concurrent estimation resulted in less error than separate estimation more often than not, the results of this study, together with other research on this topic, are not sufficient to recommend completely avoiding separate estimation in favor of concurrent estimation. Two appendixes contain MULTILOG and BILOG-MG control files.
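As a minimal illustration of the moment methods mentioned in the abstract, the sketch below implements Mean/Sigma linking under the usual IRT setup: the linking constants A and B are computed from the common items' difficulty (b) parameters on the two forms, and the new form's parameters are then placed on the base-form scale. This is not the study's code; the function names and example values are hypothetical.

```python
import statistics

def mean_sigma_linking(b_base, b_new):
    """Mean/Sigma linking constants from common-item b-parameters.

    A is the ratio of standard deviations of the base- and new-form
    difficulty estimates; B shifts the means into alignment.
    (Illustrative sketch, not the paper's implementation.)
    """
    A = statistics.stdev(b_base) / statistics.stdev(b_new)
    B = statistics.mean(b_base) - A * statistics.mean(b_new)
    return A, B

def rescale_item(a, b, A, B):
    """Place a new-form item's (a, b) parameters on the base scale."""
    return a / A, A * b + B

# Hypothetical common-item difficulty estimates on the two forms:
b_base = [-1.0, 0.0, 1.0]
b_new = [-0.5, 0.5, 1.5]
A, B = mean_sigma_linking(b_base, b_new)
# Here the spreads match (A = 1.0) and the new form is shifted by -0.5,
# so an item with (a, b) = (1.2, 0.5) maps to (1.2, 0.0) on the base scale.
a_star, b_star = rescale_item(1.2, 0.5, A, B)
```

The Mean/Mean method differs only in how A is computed (a ratio of mean discrimination parameters rather than of difficulty standard deviations); the characteristic curve methods (Stocking-Lord, Haebara) instead choose A and B to minimize differences between test or item characteristic curves.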
Name: ACT research report
Publisher: American College Testing Program
- Equated Scores
- Item Response Theory
- Estimation (Mathematics)