Setting standards and detecting intrajudge inconsistency using interdependent evaluation of response alternatives

Lei Chang, Willem J. van der Linden, Hendrik J. Vos

Research output: Contribution to journal › Article › Academic › peer-review

2 Citations (Scopus)


This article introduces a new test-centered standard-setting method, along with a procedure for detecting intrajudge inconsistency in its use. The method, based on interdependent evaluations of response alternatives, has judges closely evaluate the process that examinees use to solve multiple-choice items. The new method is analyzed against existing methods, particularly the Nedelsky and Angoff methods. Empirical results from three experiments confirm the hypothesis that standards set by the new method are higher than those set by the Nedelsky method but lower than those set by the Angoff method. The procedure for detecting intrajudge inconsistency is based on residual diagnosis of the judgments, which makes it possible to trace inconsistencies to items, response alternatives, and/or judges. An empirical application of the procedure in an experiment with the new standard-setting method suggests that the method is internally consistent and also reveals an interesting difference between the residuals for correct and incorrect alternatives.
Original language: Undefined
Pages (from-to): 781-801
Number of pages: 20
Journal: Educational and Psychological Measurement
Issue number: 5
Publication status: Published - 2004


  • Standard setting
  • Nedelsky method
  • polytomous response models
  • judgmental item analysis
  • intrajudge inconsistency
  • METIS-219637
  • multiple-choice test
  • Angoff method
  • IR-60141
