Using a stepwise approach to simultaneously develop and validate machine learning based prediction models

M. Haalboom*, S. Kort, J. van der Palen

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review



Accurate diagnosis of a disease is essential in healthcare. Prediction models based on classical regression techniques are widely used in clinical practice. Machine learning (ML) techniques might be preferred when there is a large amount of data per patient and a relatively limited number of subjects. This combination increases the risk of overfitting, however, so external validation is imperative. At the same time, new and more efficient ML techniques are developed rapidly. If recruiting patients for a validation study is time-consuming, the technique used to develop the first model may have been surpassed by the time validation is complete, rendering the original model no longer relevant. We demonstrate a stepwise design for simultaneous development and validation of prediction models based on ML techniques. Within a single study, the design enables evaluation of the stability and robustness of a prediction model over increasing sample size, as well as assessment of the stability of sensitivity and specificity at a chosen cut-off. This will shorten the time to introduction of a new test in healthcare. Finally, we describe how to use regular clinical parameters in conjunction with ML-based predictions to further enhance differentiation between subjects with and without a disease.
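The idea of checking whether sensitivity and specificity stay stable at a chosen cut-off as subjects accrue can be illustrated with a minimal sketch. This is not the authors' procedure. It assumes the ML model's predicted scores are already available, that a score at or above the cut-off counts as a positive test, and that subjects are listed in order of inclusion. The function names `sens_spec` and `stepwise_stability`, the step size, the cut-off of 0.5 and the synthetic data are all hypothetical choices made for illustration.

```python
import random


def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff is called positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in pairs if y == 1 and s < cutoff)
    tn = sum(1 for s, y in pairs if y == 0 and s < cutoff)
    fp = sum(1 for s, y in pairs if y == 0 and s >= cutoff)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def stepwise_stability(scores, labels, cutoff, step):
    """Recompute sensitivity/specificity at a fixed cut-off
    each time another `step` subjects have been included."""
    results = []
    for n in range(step, len(scores) + 1, step):
        sens, spec = sens_spec(scores[:n], labels[:n], cutoff)
        results.append((n, sens, spec))
    return results


# Synthetic example data (assumption): 200 subjects, with diseased
# subjects (label 1) tending to receive higher scores.
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(200)]
scores = [rng.gauss(0.65 if y else 0.35, 0.15) for y in labels]

for n, sens, spec in stepwise_stability(scores, labels, cutoff=0.5, step=50):
    print(f"n={n:3d}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

If the estimates stop changing materially from one step to the next, that is one informal indication of stability at the chosen cut-off. A real study would also need confidence intervals and a pre-specified stopping rule, which this sketch leaves out.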

Original language: English
Pages (from-to): 305-310
Number of pages: 6
Journal: Journal of Clinical Epidemiology
Early online date: 19 Jun 2021
Publication status: Published - 1 Feb 2022
Externally published: Yes


  • Diagnostic accuracy
  • Machine learning
  • Model stability
  • Prediction model
  • Validation

