Using patterns of summed scores in paper-and-pencil tests and CAT to detect misfitting item score patterns

R.R. Meijer

Research output: Contribution to journal › Article › Academic › peer-review

4 Citations (Scopus)


Two new methods have been proposed to determine unexpected sum scores on subtests (testlets), both for paper-and-pencil tests and computer adaptive tests. A method based on a conservative bound using the hypergeometric distribution, denoted p, was compared with a method where the probability for each score combination was calculated using a highest density region (HDR). Furthermore, these methods were compared with the standardized log-likelihood statistic with and without a correction for the estimated latent trait value (denoted as l*z and lz, respectively). Data were simulated on the basis of the one-parameter logistic model, and both parametric and nonparametric logistic regression were used to obtain estimates of the latent trait. Results showed that it is important to take the trait level into account when comparing subtest scores. In a nonparametric item response theory (IRT) context, an adapted version of the HDR method was a powerful alternative to p. In a parametric IRT context, results showed that l*z had the highest power when the data were simulated conditionally on the estimated latent trait level.
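To make the uncorrected statistic lz concrete, the following is a minimal sketch of the standardized log-likelihood person-fit statistic under the one-parameter logistic model. It is an illustration only, not the article's implementation: the function names `rasch_prob` and `lz_statistic`, the item difficulties, and the trait value are all assumptions chosen for the example, and the corrected version l*z is not shown.

```python
import math


def rasch_prob(theta, b):
    """Probability of a correct response under the one-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def lz_statistic(responses, theta, difficulties):
    """Standardized log-likelihood (lz) of a 0/1 item score pattern.

    responses    : list of 0/1 item scores
    theta        : latent trait value (treated as known here; an assumption)
    difficulties : item difficulty parameters (hypothetical values in the example)
    """
    log_lik = 0.0   # observed log-likelihood of the pattern
    expected = 0.0  # its expectation given theta
    variance = 0.0  # its variance given theta
    for x, b in zip(responses, difficulties):
        p = rasch_prob(theta, b)
        q = 1.0 - p
        log_lik += x * math.log(p) + (1 - x) * math.log(q)
        expected += p * math.log(p) + q * math.log(q)
        variance += p * q * math.log(p / q) ** 2
    # Note: the variance is zero if every difficulty equals theta.
    return (log_lik - expected) / math.sqrt(variance)


# Hypothetical five-item test, ordered from easy to hard.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
theta = 0.0

consistent = [1, 1, 1, 0, 0]  # easy items correct, hard items incorrect
aberrant = [0, 0, 1, 1, 1]    # easy items incorrect, hard items correct

lz_consistent = lz_statistic(consistent, theta, difficulties)
lz_aberrant = lz_statistic(aberrant, theta, difficulties)
print(lz_consistent, lz_aberrant)
```

Large negative values indicate a pattern that is unlikely given the trait level, so the aberrant pattern yields a much lower value than the consistent one. In practice theta is estimated rather than known, which is the situation the correction in l*z is meant to address.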
Original language: English
Pages (from-to): 119-136
Number of pages: 17
Journal: Journal of educational measurement
Issue number: 2
Publication status: Published - 2004



  • METIS-219664
  • IR-104287
