Abstract
Introduction
Early diagnosis and improved treatment have increased the number of breast cancer survivors. As a result, more people struggle with the long-term effects of cancer and its treatment, one of which is cancer-related fatigue (CRF). Recognizing CRF early is important, because starting a CRF intervention in time can prevent it from worsening and becoming chronic. Using machine learning models, we aimed to predict the individual risk of developing CRF.
Methods
Data from the Primary Secondary Cancer Care Registry (PSCCR) were used, in which information from the Netherlands Cancer Registry (NCR) is combined with data from General Practitioners (GPs) via Nivel Primary Care. We included 12,813 breast cancer patients, of whom 2,224 visited the GP with fatigue complaints. Predictors (n=64) were related to patient, tumour and treatment characteristics and to GP visits before diagnosis. Missing data were imputed using Multiple Imputation by Chained Equations, and risk was predicted using a Random Forest Classifier, Logistic Regression, Gaussian Naïve Bayes, K-Nearest Neighbours and a Multi-Layer Perceptron. Nested 5-fold cross-validation was used to optimize hyperparameters and to assess model performance by comparing the area under the receiver operating characteristic curve (AUC-score).
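The nested cross-validation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual pipeline: the data are synthetic, the hyperparameter grid is hypothetical, and scikit-learn's `IterativeImputer` is used as a single-imputation stand-in inspired by chained equations rather than a full multiple-imputation procedure.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (required to import IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (assumption): 300 "patients", 6 numeric predictors.
rng = np.random.default_rng(0)
n_patients, n_predictors = 300, 6
X = rng.normal(size=(n_patients, n_predictors))
# Binary outcome weakly related to the first predictor (roughly 25% positives).
y = (X[:, 0] + rng.normal(scale=2.0, size=n_patients) > 1.5).astype(int)
# Knock out about 10% of the predictor values to mimic missing data.
X[rng.random(X.shape) < 0.1] = np.nan

# Imputation sits inside the pipeline, so it is refitted on each training
# fold and never sees the corresponding test fold.
pipeline = Pipeline([
    ("impute", IterativeImputer(random_state=0)),
    ("model", RandomForestClassifier(n_estimators=25, random_state=0)),
])

# Hypothetical grid; the hyperparameters tuned in the study are not stated.
param_grid = {"model__max_depth": [2, 4]}

inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=2)

# Inner loop: choose hyperparameters by AUC on the training part only.
search = GridSearchCV(pipeline, param_grid, scoring="roc_auc", cv=inner_cv)

# Outer loop: estimate the AUC of the whole tuning procedure on held-out folds.
outer_scores = cross_val_score(search, X, y, scoring="roc_auc", cv=outer_cv)
print(f"AUC: {outer_scores.mean():.2f} ± {outer_scores.std():.2f}")
```

Keeping tuning in the inner loop and scoring in the outer loop means the reported AUC is not inflated by hyperparameters having been selected on the same folds that are used for evaluation.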
Results
The performance of the models was poor to moderate, with AUC-scores ranging from 0.54 to 0.63. The Random Forest Classifier and the Logistic Regression model performed best, with AUC-scores of 0.63±0.014 and 0.62±0.09, respectively.
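To put these AUC-scores in context: the AUC equals the probability that a randomly chosen fatigued patient receives a higher predicted risk than a randomly chosen non-fatigued patient, so 0.5 corresponds to chance and 1.0 to perfect ranking. The sketch below computes it directly from that definition; the function name `auc_score` and the toy labels and scores are illustrative assumptions, not study data.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc_score(labels, scores))  # 0.75

# A model that gives every patient the same risk cannot rank anyone: AUC 0.5.
print(auc_score(labels, [0.2, 0.2, 0.2, 0.2]))  # 0.5
```

The pairwise loop is quadratic in the number of patients and is meant only to make the definition concrete; rank-based implementations give the same value far more efficiently on large datasets.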
Conclusion
Using machine learning models on the PSCCR dataset, the individual risk of CRF could not be predicted accurately. This may be due to how fatigue was assessed as the outcome measure: not all patients with fatigue complaints visit the GP, and not all fatigue complaints are necessarily related to the breast cancer diagnosis. In future studies, we hope to collect more detailed data and to differentiate more clearly between fatigued and non-fatigued patients.
Original language | English |
---|---|
Publication status | Published - Jun 2022 |
Event | WEON 2022: The art of epidemiology - Nijmegen, Netherlands; Duration: 9 Jun 2022 → 10 Jun 2022 |
Conference
Conference | WEON 2022 |
---|---|
Country/Territory | Netherlands |
City | Nijmegen |
Period | 9/06/22 → 10/06/22 |