TY - BOOK

T1 - Evaluating equating error in observed-score equating

AU - van der Linden, Willem J.

PY - 2006/7

Y1 - 2006/7

N2 - Traditionally, error in equating observed scores on two versions of a test is defined as the difference between the transformations that equate the quantiles of their distributions in the sample and in the population of examinees. This definition underlies, for example, the well-known approximation to the standard error of equating by Lord (1982). But it is argued that if the goal of equating is to adjust the scores of examinees on one version of the test to make them indistinguishable from those on another, equating error should be defined as the degree to which the equated scores realize this goal. Two equivalent definitions of equating error based on this criterion are formulated. It is shown how these definitions allow us to estimate the bias and mean-squared error of any equating method if the response data fit an item-response theory model. An evaluation of the traditional equipercentile equating method and two new conditional methods for tests from a previous item pool of the Law School Admission Test (LSAT) shows that, under a variety of conditions, the equipercentile method tends to result in a serious bias and error, whereas the new methods are practically free of any error, except when the test to be equated has poorly discriminating items.

AB - Traditionally, error in equating observed scores on two versions of a test is defined as the difference between the transformations that equate the quantiles of their distributions in the sample and in the population of examinees. This definition underlies, for example, the well-known approximation to the standard error of equating by Lord (1982). But it is argued that if the goal of equating is to adjust the scores of examinees on one version of the test to make them indistinguishable from those on another, equating error should be defined as the degree to which the equated scores realize this goal. Two equivalent definitions of equating error based on this criterion are formulated. It is shown how these definitions allow us to estimate the bias and mean-squared error of any equating method if the response data fit an item-response theory model. An evaluation of the traditional equipercentile equating method and two new conditional methods for tests from a previous item pool of the Law School Admission Test (LSAT) shows that, under a variety of conditions, the equipercentile method tends to result in a serious bias and error, whereas the new methods are practically free of any error, except when the test to be equated has poorly discriminating items.

KW - IR-104262

M3 - Report

T3 - LSAC research report series

BT - Evaluating equating error in observed-score equating

PB - Law School Admission Council

CY - Newton, PA, USA

ER -