Adaptive testing for making unidimensional and multidimensional classification decisions

M.M. van Groen

Research output: Thesis › PhD Thesis - Research external, graduation UT


Abstract

Computerized adaptive tests (CATs) were originally developed to obtain an efficient estimate of the examinee’s ability, but they can also be used to classify the examinee into one of two or more levels (e.g., master/non-master). These computerized classification tests have the advantage that they, too, can be tailored to the individual student’s ability. Computerized classification tests require a method that decides whether testing can stop and which decision can be made with the desired confidence. Furthermore, a method to select the items is required. In classification testing for unidimensional constructs, items are often selected to measure optimally at either the cutoff point(s) or the student’s current ability estimate. Four methods were developed that combine the efficiency of the first approach with the adaptive item selection of the second approach. Their efficiency and accuracy were investigated using simulations. Several methods are available to make classification decisions for constructs modeled with a unidimensional item response theory model, but few classification methods are available if the construct is multidimensional. A classification method based on Wald’s Sequential Probability Ratio Test was developed for application to CAT with a multidimensional item response theory model in which each item measures multiple abilities. Seitz and Frey’s (2013) method for making classifications per dimension, when each item measures one dimension, was adapted to make classifications on the entire test and on parts of the test. Kingsbury and Weiss’s (1979) popular unidimensional classification method, which uses the confidence interval surrounding the ability estimate, was also adapted for multidimensional decisions. Simulation studies were used to investigate the efficiency and accuracy of the classification methods. Comparisons were made between different item selection methods, between different classification methods, and between different settings for the classification methods.

Tests can be used for formative assessment, formative evaluation, summative assessment, and summative evaluation. For seven types of tests, including computerized classification tests and educational games, the design, the possibility of adapting the test, and the possible use for each of the test goals were explored.
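
To make the stopping rule and item selection described in the abstract concrete, here is a minimal unidimensional sketch in Python. It is an illustration under stated assumptions, not the thesis's implementation. It assumes a Rasch model, an indifference region of half-width `delta` around the cutoff, and nominal error rates `alpha` and `beta`. All function names and default values are hypothetical. Wald's Sequential Probability Ratio Test compares the likelihood of the observed responses at `cutoff + delta` with the likelihood at `cutoff - delta` and stops once the log ratio crosses one of two thresholds. The item selection helper picks the item with maximum Fisher information at a chosen point, which can be either the cutoff or the current ability estimate.

```python
import math


def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def sprt_decision(responses, difficulties, cutoff,
                  delta=0.2, alpha=0.05, beta=0.05):
    """Wald's SPRT for a master/non-master decision.

    Tests H0: theta = cutoff - delta against H1: theta = cutoff + delta.
    Returns "master", "non-master", or "continue" (administer another item).
    delta, alpha, and beta are illustrative defaults, not values from the thesis.
    """
    theta_lo, theta_hi = cutoff - delta, cutoff + delta
    log_lr = 0.0
    for x, b in zip(responses, difficulties):
        p_hi = rasch_prob(theta_hi, b)
        p_lo = rasch_prob(theta_lo, b)
        if x == 1:
            log_lr += math.log(p_hi / p_lo)
        else:
            log_lr += math.log((1.0 - p_hi) / (1.0 - p_lo))
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this
    if log_lr >= upper:
        return "master"
    if log_lr <= lower:
        return "non-master"
    return "continue"


def item_information(theta, b):
    """Fisher information of a Rasch item at ability theta."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)


def select_item(available_difficulties, point):
    """Return the difficulty of the item that is most informative at `point`.

    `point` can be the cutoff or the current ability estimate.
    """
    return max(available_difficulties,
               key=lambda b: item_information(point, b))


if __name__ == "__main__":
    # Six correct answers on items located at the cutoff.
    print(sprt_decision([1] * 6, [0.0] * 6, cutoff=0.0, delta=0.5))
    # Item closest to the cutoff carries the most information there.
    print(select_item([-1.0, 0.1, 2.0], point=0.0))
```

Passing the cutoff as `point` corresponds to selecting items that measure best at the cutoff, and passing the current ability estimate corresponds to the adaptive alternative. The multidimensional methods described above would need a multidimensional model and are not covered by this sketch.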
Original language: English
Awarding Institution
  • University of Twente
Supervisors/Advisors
  • Eggen, Theo T.J.H.M., Supervisor
  • Veldkamp, Bernard P., Supervisor
Award date: 21 Nov 2014
Place of Publication: Enschede
Publisher: Universiteit Twente
Print ISBNs: 978-94-6259-416-6
Publication status: Published - 21 Nov 2014

Fingerprint

Testing
Students

Keywords

  • IR-92548
  • METIS-306396

Cite this

van Groen, M.M. / Adaptive testing for making unidimensional and multidimensional classification decisions. Enschede: Universiteit Twente, 2014. 188 p.
@phdthesis{0127f8435086434eaac9127852f5acbc,
title = "Adaptive testing for making unidimensional and multidimensional classification decisions",
keywords = "IR-92548, METIS-306396",
author = "{van Groen}, M.M.",
year = "2014",
month = "11",
day = "21",
language = "English",
isbn = "978-94-6259-416-6",
publisher = "Universiteit Twente",
school = "University of Twente",

}
