Simulating the null distribution of person-fit statistics for conventional and adaptive tests

R.R. Meijer, Edith van Krimpen-Stoop

Research output: Book/Report > Report > Professional

Abstract

Several person-fit statistics have been proposed to detect item score patterns that do not fit an item response theory model. To classify a response pattern as not fitting the model, the distribution of the person-fit statistic under model fit (the null distribution) is needed. The null distributions of several fit statistics have been investigated for conventionally administered tests, but less is known about the distribution of fit statistics in computerized adaptive testing (CAT). A three-part simulation study of this distribution is described. The first study examined the theoretical distribution of the often-used l(z) statistic across theta levels in conventional testing and in CAT, with l(z) calculated from both true theta and estimated theta. It also examined, in both testing environments, the distribution of l*(z), a statistic proposed by T. Snijders (1998) that corrects for the error in estimated theta. The second study examined simulating the distribution of l(z) for conventional tests under the two-parameter logistic model, and two procedures for simulating the distributions of l(z) and l*(z) in a CAT: (1) item scores were simulated with a fixed set of administered items; and (2) item scores were generated according to a stochastic design, where the choice of administered item i + 1 depended on the responses to previously administered items. The third study was a power study comparing the detection rates of l*(z) and l(z) for conventional tests. Results indicate that the distribution of l(z) differed from the theoretical distribution in both conventional and CAT environments. In conventional testing, the distribution of l*(z) was in accord with the theoretical distribution, but in CAT it differed from the theoretical distribution. In conventional testing, simulating the sampling distribution of l(z) for every examinee, based on theta, gave an appropriate approximation of the distribution. In the CAT environment, however, simulating the sampling distributions of both l(z) and l*(z) was problematic. Two appendixes show the derivation of the l*(z) statistic and discuss modeling local dependence.
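
The fixed-item simulation for a conventional test described above can be illustrated with a minimal sketch. It is not the report's own code. It assumes the standard definition of l(z) (the log-likelihood of the response pattern, standardized by its expectation and variance under the model), the two-parameter logistic model, and a known true theta. The item parameters, test length, number of replications, and all function names (`p_2pl`, `lz`, `simulate_lz_null`) are hypothetical choices made for illustration only.

```python
import numpy as np


def p_2pl(theta, a, b):
    """Probability of a correct response under the two-parameter logistic model."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))


def lz(x, p):
    """Standardized log-likelihood person-fit statistic l(z).

    x: 0/1 item scores, shape (n_items,) or (n_patterns, n_items)
    p: model probabilities of a correct response, shape (n_items,)
    """
    x = np.asarray(x, dtype=float)
    # Log-likelihood of each observed pattern.
    l0 = np.sum(x * np.log(p) + (1 - x) * np.log(1 - p), axis=-1)
    # Expectation and variance of the log-likelihood under the model.
    mean = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))
    var = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)
    return (l0 - mean) / np.sqrt(var)


def simulate_lz_null(theta, a, b, n_rep=10000, seed=0):
    """Simulate l(z) for model-fitting patterns on a fixed set of items."""
    rng = np.random.default_rng(seed)
    p = p_2pl(theta, a, b)
    # Each row is one simulated response pattern generated from the model.
    x = (rng.random((n_rep, p.size)) < p).astype(float)
    return lz(x, p)


# Hypothetical 30-item test; these parameters are assumptions, not the report's.
rng = np.random.default_rng(1)
a = rng.uniform(0.5, 1.5, size=30)   # discrimination parameters
b = rng.normal(0.0, 1.0, size=30)    # difficulty parameters

null = simulate_lz_null(theta=0.0, a=a, b=b)
# Empirical 5% critical value: patterns with smaller l(z) would be flagged.
critical = np.quantile(null, 0.05)
```

Under these assumptions, a pattern whose l(z) falls below `critical` would be flagged as misfitting at the 5% level. The sketch covers only the fixed-item case with true theta. In a CAT the administered items depend on earlier responses, which is the stochastic design the abstract treats separately.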
Original language: English
Place of Publication: Enschede
Publisher: Universiteit Twente TO/OMD
Number of pages: 34
Publication status: Published - 1998

Publication series

Name: OMD research report
Publisher: University of Twente, Faculty of Educational Science and Technology
No.: 98-02

Keywords

  • Scores
  • Item Response Theory
  • Test Items
  • Responses
  • Statistical Distributions
  • METIS-136545
  • Ability
  • Adaptive Testing
  • Foreign Countries
  • Simulation
  • IR-103775
  • Power (Statistics)

Cite this

Meijer, R. R., & van Krimpen-Stoop, E. (1998). Simulating the null distribution of person-fit statistics for conventional and adaptive tests. (OMD research report; No. 98-02). Enschede: Universiteit Twente TO/OMD.