Background
Multi-item questionnaires are important instruments for monitoring health in epidemiological longitudinal studies. Sum-scores are most often used as the summary measure for these questionnaires. The objective of this study was to show the negative impact of basing longitudinal data analysis on sum-scores rather than on Item Response Theory (IRT)-based plausible values.

Methods
In a simulation study that varied the number of items, the sample size, and the distribution of the outcomes, the parameter estimates from both modeling techniques were compared with the true values. Next, both models were applied to an example dataset from the Amsterdam Growth and Health Longitudinal Study (AGHLS).

Results
Using sum-scores leads to overestimation of the within-person (repeated-measurement) variance and underestimation of the between-person variance.

Conclusions
We recommend using IRT-based plausible value techniques for analyzing repeatedly measured multi-item questionnaire data.
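
The variance shift reported in the Results can be illustrated with a small simulation. The sketch below is not the study's simulation design and does not fit an IRT model or draw plausible values. It assumes a Rasch model with dichotomous items, normally distributed person and occasion effects, and hypothetical parameter values and function names (`simulate_icc`, `one_way_icc`). It compares the share of between-person variance (the intraclass correlation) in the true latent scores with the share in the corresponding sum-scores.

```python
import math
import random


def one_way_icc(rows):
    """ANOVA-based intraclass correlation for a balanced persons x waves table."""
    n, k = len(rows), len(rows[0])
    grand = sum(sum(r) for r in rows) / (n * k)
    means = [sum(r) / k for r in rows]
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for r, m in zip(rows, means) for x in r
    ) / (n * (k - 1))
    var_between = (ms_between - ms_within) / k
    return var_between / (var_between + ms_within)


def simulate_icc(n_persons=2000, n_waves=4, n_items=10,
                 var_between=1.0, var_within=0.5, seed=0):
    """Return (ICC of latent scores, ICC of sum-scores) for simulated data.

    All defaults are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    # Assumed item difficulties, evenly spaced on the latent scale.
    if n_items == 1:
        difficulties = [0.0]
    else:
        difficulties = [-1.5 + 3.0 * j / (n_items - 1) for j in range(n_items)]

    latent, sums = [], []
    for _ in range(n_persons):
        person_effect = rng.gauss(0.0, math.sqrt(var_between))
        latent_row, sum_row = [], []
        for _ in range(n_waves):
            theta = person_effect + rng.gauss(0.0, math.sqrt(var_within))
            # Rasch model: P(correct) = logistic(theta - difficulty).
            score = sum(
                1 for b in difficulties
                if rng.random() < 1.0 / (1.0 + math.exp(-(theta - b)))
            )
            latent_row.append(theta)
            sum_row.append(score)
        latent.append(latent_row)
        sums.append(sum_row)
    return one_way_icc(latent), one_way_icc(sums)


if __name__ == "__main__":
    icc_latent, icc_sum = simulate_icc()
    print(f"ICC of latent scores: {icc_latent:.3f}")
    print(f"ICC of sum-scores:    {icc_sum:.3f}")
```

Under these assumptions the sum-score intraclass correlation should fall below the latent one, because item-level measurement error is absorbed into the within-person component. Raising `n_items` should narrow the gap.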