Tests, surveys, and questionnaires are all around us these days, and there is increasing interest in comparing the resulting scores: between countries, between males and females, or over measurement occasions. In the design and analysis of such measurement instruments, a major concern is that the items should measure the construct in the same way in all groups the instrument is intended for. When groups differ in, for example, the difficulty of test items or the relevance of items for measuring an underlying attitude, the resulting scores are not directly comparable. In this thesis, Bayesian Item Response Theory (IRT) models are investigated for modeling variation in the way measurement instruments function across groups. Models are developed in which the item characteristics for each group are modeled hierarchically as deviations from general item characteristics. In contrast to traditional methods, no invariant anchor items are required, and the method is easily applied in situations with a large number of groups. Extensions with explanatory information about item, person, and group characteristics, and with longitudinal growth structures, are incorporated within this framework. Bayesian Markov chain Monte Carlo (MCMC) methods are used to estimate these complex models. To identify which items have invariant characteristics across groups, invariance tests based on the Bayes factor for nested models and on the Deviance Information Criterion are developed. The result is a comprehensive framework that includes both tests to diagnose whether measurement instruments function differently across groups and models that take these differences into account, enabling valid score comparisons and giving insight into the nature of these differences.
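The hierarchical idea described above, with group-specific item parameters treated as deviations from general item characteristics, could be sketched roughly as follows. This is a minimal illustration, not the thesis's actual model: the function names, the one-parameter (Rasch) response form, and the normal deviation distribution with standard deviation `tau` are all assumptions chosen for simplicity.

```python
import numpy as np

def simulate_group_difficulties(general_b, n_groups, tau, rng):
    """Draw group-specific item difficulties as deviations from general ones.

    Hypothetical notation: b[j, g] = general_b[j] + u[j, g], with
    u[j, g] ~ N(0, tau^2). When tau = 0, every item is invariant and
    all groups share the general difficulties.
    """
    general_b = np.asarray(general_b, dtype=float)
    u = rng.normal(0.0, tau, size=(general_b.size, n_groups))
    return general_b[:, None] + u

def rasch_prob(theta, b):
    """Rasch (1PL) model: P(correct) = 1 / (1 + exp(-(theta - b)))."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

rng = np.random.default_rng(0)
# Three items with general difficulties -1, 0, 1, observed in four groups.
b = simulate_group_difficulties([-1.0, 0.0, 1.0], n_groups=4, tau=0.3, rng=rng)
print(b.shape)  # (3, 4): one difficulty per item per group
print(rasch_prob(0.0, b))  # response probabilities for an average person
```

In the thesis's Bayesian setting, `tau` itself would be estimated per item rather than fixed, so that items with small deviation variance can be flagged as (approximately) invariant across groups.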
Award date: 16 Nov 2012
Place of publication: Enschede
Publication status: Published - 16 Nov 2012