Methods are presented for comparing grades obtained in a situation where students can choose between different subjects. It must be expected that the comparison between the grades is complicated by the interaction between the students' pattern and level of proficiency on one hand, and the choice of the subjects on the other hand. Three methods based on item response theory (IRT) for the estimation of proficiency measures that are comparable over students and subjects are discussed: a method based on a model with a unidimensional representation of proficiency, a method based on a model with a multidimensional representation of proficiency, and a method based on a multidimensional representation of proficiency where the stochastic nature of the choice of examination subjects is explicitly modeled. The methods are compared using the data from the Central Examinations in Secondary Education in the Netherlands. The results show that the unidimensional IRT model produces unrealistic results, which do not appear when using the two multidimensional IRT models. Further, it is shown that both the multidimensional models produce acceptable model fit. However, the model that explicitly takes the choice process into account produces the best model fit.