Computerized adaptive tests (CATs) were originally developed to obtain an efficient estimate of the examinee’s ability, but they can also be used to classify the examinee into one of two or more levels (e.g., master/non-master). These computerized classification tests have the advantage that they can also be tailored to the individual student’s ability. Computerized classification tests require a method that decides whether testing can stop and which classification can be made with the desired confidence. Furthermore, a method to select the items is required. In classification testing for unidimensional constructs, items are often selected to measure optimally at either the cutoff point(s) or the student’s current ability estimate. Four methods were developed that combined the efficiency of the first approach with the adaptive item selection of the second approach. Their efficiency and accuracy were investigated using simulations. Several methods are available to make the classification decisions for constructs modeled with a unidimensional item response theory model. If the construct is multidimensional, however, few classification methods are available. A classification method based on Wald’s Sequential Probability Ratio Test was developed for application to CAT with a multidimensional item response theory model in which each item measures multiple abilities. Seitz and Frey’s (2013) method for making classifications per dimension, when each item measures one dimension, was adapted to make classifications on the entire test and on parts of the test. Kingsbury and Weiss’s (1979) popular unidimensional classification method, which uses the confidence interval surrounding the ability estimate, was also adapted for multidimensional decisions. Simulation studies were used to investigate the efficiency and accuracy of the classification methods.
Comparisons were made between different item selection methods, between different classification methods, and between different settings for the classification methods. Tests can be used for formative assessment, formative evaluation, summative assessment, and summative evaluation. For seven types of tests, including computerized classification tests and educational games, the design, the possibility to adapt the test, and the possible use for each of the test goals were explored.
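To illustrate the kind of stopping rule the abstract refers to, the following is a minimal sketch of a unidimensional Sequential Probability Ratio Test for a master/non-master decision. It is not the multidimensional method developed in the thesis. The two-parameter logistic model, the indifference half-width `delta`, the error rates `alpha` and `beta`, and the function names are all assumptions made for this example.

```python
import math

def p_correct(theta, a, b):
    """Probability of a correct response under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def sprt_classify(responses, items, cutoff, delta=0.2, alpha=0.05, beta=0.05):
    """Return 'master', 'non-master', or 'continue'.

    responses: list of 0/1 item scores.
    items: list of (a, b) pairs, discrimination and difficulty (hypothetical).
    cutoff: ability value separating the two levels.
    """
    # Two simple hypotheses placed just below and just above the cutoff.
    theta_lo, theta_hi = cutoff - delta, cutoff + delta

    # Accumulate the log-likelihood ratio over the administered items.
    log_lr = 0.0
    for x, (a, b) in zip(responses, items):
        p_hi = p_correct(theta_hi, a, b)
        p_lo = p_correct(theta_lo, a, b)
        if x == 1:
            log_lr += math.log(p_hi / p_lo)
        else:
            log_lr += math.log((1.0 - p_hi) / (1.0 - p_lo))

    # Wald's decision bounds from the acceptable error rates.
    upper = math.log((1.0 - beta) / alpha)
    lower = math.log(beta / (1.0 - alpha))

    if log_lr >= upper:
        return "master"
    if log_lr <= lower:
        return "non-master"
    return "continue"
```

In an adaptive test, a function like this would be called after each response; testing continues while it returns `"continue"`, usually up to a maximum test length at which a forced decision is made.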
|Award date||21 Nov 2014|
|Place of Publication||Enschede|
|Publication status||Published - 21 Nov 2014|