Prediction of cancer-related fatigue using multiple machine learning models

Research output: Contribution to conference › Abstract › Academic


Early diagnosis and improved treatment have increased the number of breast cancer survivors. As a result, more people struggle with the long-term effects of cancer and its treatment. One of these effects is cancer-related fatigue (CRF). Recognizing CRF in time makes it possible to start a CRF intervention and prevent the fatigue from worsening and becoming chronic. Using machine learning models, we aimed to predict the individual risk of developing CRF.
Data from the Primary Secondary Cancer Care Registry (PSCCR) were used, in which information from the Netherlands Cancer Registry (NCR) was combined with data from general practitioners (GPs) via Nivel Primary Care. We included 12,813 breast cancer patients, of whom 2,224 visited the GP with fatigue complaints. Predictors (n=64) were related to patient, tumour and treatment characteristics and to GP visits before diagnosis. Missing data were imputed using Multiple Imputation by Chained Equations, and risk was predicted using a Random Forest Classifier, Logistic Regression, Gaussian Naïve Bayes, K-Nearest Neighbours and a Multi-Layer Perceptron. Nested 5-fold cross-validation was used to optimize hyperparameters and to assess model performance by comparing the area under the receiver operating characteristic curve (AUC-score).
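The evaluation scheme described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used in the study. The data are synthetic, the hyperparameter grids are hypothetical and deliberately tiny, and the scaling step is an assumption. scikit-learn's `IterativeImputer` stands in for Multiple Imputation by Chained Equations; it follows the chained-equations idea but returns a single imputed dataset instead of several.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the PSCCR dataset): roughly 17% positives,
# with about 5% of the values removed at random.
X, y = make_classification(n_samples=200, n_features=8, n_informative=4,
                           weights=[0.83, 0.17], random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Hypothetical grids, one per model family named in the abstract.
models = {
    "random_forest": (RandomForestClassifier(n_estimators=25, random_state=0),
                      {"clf__max_depth": [3, None]}),
    "logistic_regression": (LogisticRegression(max_iter=1000),
                            {"clf__C": [0.1, 1.0]}),
    "gaussian_nb": (GaussianNB(), {"clf__var_smoothing": [1e-9, 1e-6]}),
    "knn": (KNeighborsClassifier(), {"clf__n_neighbors": [5, 15]}),
    "mlp": (MLPClassifier(max_iter=200, random_state=0),
            {"clf__hidden_layer_sizes": [(8,), (16,)]}),
}


def nested_cv_auc(estimator, grid, X, y, seed=0):
    """Return the mean and standard deviation of the outer-fold AUC."""
    pipe = Pipeline([
        # Imputation sits inside the pipeline so it is refit on each training fold.
        ("impute", IterativeImputer(max_iter=3, random_state=seed)),
        ("scale", StandardScaler()),  # assumption: features are standardized
        ("clf", estimator),
    ])
    inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed + 1)
    # Inner loop: hyperparameter tuning on the training part of each outer fold.
    search = GridSearchCV(pipe, grid, scoring="roc_auc", cv=inner)
    # Outer loop: performance estimate on data not used for tuning.
    scores = cross_val_score(search, X, y, scoring="roc_auc", cv=outer)
    return scores.mean(), scores.std()


results = {name: nested_cv_auc(est, grid, X, y)
           for name, (est, grid) in models.items()}
for name, (mean, std) in results.items():
    print(f"{name}: AUC {mean:.2f} ± {std:.2f}")
```

Keeping the imputer inside the pipeline means the imputation model never sees the held-out fold, which avoids leaking information into the AUC estimate.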
The performance of the models was poor to moderate, with AUC-scores ranging from 0.54 to 0.63. The Random Forest Classifier and the Logistic Regression model performed best, with AUC-scores of 0.63±0.014 and 0.62±0.09, respectively.
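To put these values in context, the AUC can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it. A value of 0.5 corresponds to chance and 1.0 to perfect ranking. The sketch below computes it directly from that definition; the helper name `auc_score` and the example risks are hypothetical.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks for six patients (1 = fatigue complaint recorded).
labels = [1, 1, 1, 0, 0, 0]
perfect = auc_score(labels, [0.9, 0.8, 0.7, 0.3, 0.2, 0.1])  # every pair ranked correctly
uninformative = auc_score(labels, [0.5] * 6)                 # all scores tied
mixed = auc_score(labels, [0.9, 0.4, 0.3, 0.8, 0.2, 0.1])    # 7 of 9 pairs ranked correctly
```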
Using machine learning models on the PSCCR dataset, the individual risk of CRF cannot be predicted accurately. This may be due to the way fatigue was assessed as the outcome measure: not all patients with fatigue complaints visit the GP, and not all fatigue complaints are necessarily related to the breast cancer diagnosis. In future studies, we hope to collect more detailed data and to differentiate more clearly between fatigued and non-fatigued patients.
Original language: English
Publication status: Published - Jun 2022
Event: WEON 2022: The art of epidemiology - Nijmegen, Netherlands
Duration: 9 Jun 2022 → 10 Jun 2022


Conference: WEON 2022

