TY - BOOK
T1 - Evaluating equating error in observed-score equating
AU - van der Linden, Willem J.
PY - 2006/7
Y1 - 2006/7
N2 - Traditionally, error in equating observed scores on two versions of a test is defined as the difference between the transformations that equate the quantiles of their distributions in the sample and in the population of examinees. This definition underlies, for example, the well-known approximation to the standard error of equating by Lord (1982). But it is argued that if the goal of equating is to adjust the scores of examinees on one version of the test to make them indistinguishable from those on another, equating error should be defined as the degree to which the equated scores realize this goal. Two equivalent definitions of equating error based on this criterion are formulated. It is shown how these definitions allow us to estimate the bias and mean-squared error of any equating method if the response data fit an item-response theory model. An evaluation of the traditional equipercentile equating method and two new conditional methods for tests from a previous item pool of the Law School Admission Test (LSAT) shows that, under a variety of conditions, the equipercentile method tends to result in a serious bias and error, whereas the new methods are practically free of any error, except when the test to be equated has poorly discriminating items.
AB - Traditionally, error in equating observed scores on two versions of a test is defined as the difference between the transformations that equate the quantiles of their distributions in the sample and in the population of examinees. This definition underlies, for example, the well-known approximation to the standard error of equating by Lord (1982). But it is argued that if the goal of equating is to adjust the scores of examinees on one version of the test to make them indistinguishable from those on another, equating error should be defined as the degree to which the equated scores realize this goal. Two equivalent definitions of equating error based on this criterion are formulated. It is shown how these definitions allow us to estimate the bias and mean-squared error of any equating method if the response data fit an item-response theory model. An evaluation of the traditional equipercentile equating method and two new conditional methods for tests from a previous item pool of the Law School Admission Test (LSAT) shows that, under a variety of conditions, the equipercentile method tends to result in a serious bias and error, whereas the new methods are practically free of any error, except when the test to be equated has poorly discriminating items.
KW - IR-104262
M3 - Report
T3 - LSAC research report series
BT - Evaluating equating error in observed-score equating
PB - Law School Admission Council
CY - Newton, PA, USA
ER -