Traditionally, error in equating observed scores on two versions of a test is defined as the difference between the transformations that equate the quantiles of their distributions in the sample and in the population of examinees. This definition underlies, for example, the well-known approximation to the standard error of equating by Lord (1982). However, it is argued that if the goal of equating is to adjust the scores of examinees on one version of the test to make them indistinguishable from those on another, equating error should be defined as the degree to which the equated scores realize this goal. Two equivalent definitions of equating error based on this criterion are formulated. These definitions can be used to evaluate existing equating methods and to derive new methods if the response data fit an item-response theory model. An evaluation of the traditional equipercentile equating method and two new conditional methods for tests from a previous item pool of the Law School Admission Test showed that, under a variety of conditions, the equipercentile method tends to produce serious bias in the equated scores, while the new methods are practically free of bias.
|Name||OMD Research Report|
|Publisher||University of Twente, Faculty of Educational Science and Technology|