TY - UNPB
T1 - Case-level Breast Cancer Prediction for Real Hospital Settings
AU - Pathak, Shreyasi
AU - Schlötterer, Jörg
AU - Geerdink, Jeroen
AU - Veltman, Jeroen
AU - van Keulen, Maurice
AU - Strisciuglio, Nicola
AU - Seifert, Christin
N1 - 31 pages, 15 figures, 12 tables
PY - 2023/10/19
Y1 - 2023/10/19
N2 - Breast cancer prediction models for mammography assume that annotations are available for individual images or regions of interest (ROIs), and that there is a fixed number of images per patient. These assumptions do not hold in real hospital settings, where clinicians provide only a final diagnosis for the entire mammography exam (case). Since data in real hospital settings scales with continuous patient intake, while manual annotation efforts do not, we develop a framework for case-level breast cancer prediction that does not require any manual annotation and can be trained with case labels readily available at the hospital. Specifically, we propose a two-level multi-instance learning (MIL) approach at patch and image level for case-level breast cancer prediction and evaluate it on two public and one private dataset. We propose a novel domain-specific MIL pooling observing that breast cancer may or may not occur in both sides, while images of both breasts are taken as a precaution during mammography. We propose a dynamic training procedure for training our MIL framework on a variable number of images per case. We show that our two-level MIL model can be applied in real hospital settings where only case labels, and a variable number of images per case are available, without any loss in performance compared to models trained on image labels. Only trained with weak (case-level) labels, it has the capability to point out in which breast side, mammography view and view region the abnormality lies.
AB - Breast cancer prediction models for mammography assume that annotations are available for individual images or regions of interest (ROIs), and that there is a fixed number of images per patient. These assumptions do not hold in real hospital settings, where clinicians provide only a final diagnosis for the entire mammography exam (case). Since data in real hospital settings scales with continuous patient intake, while manual annotation efforts do not, we develop a framework for case-level breast cancer prediction that does not require any manual annotation and can be trained with case labels readily available at the hospital. Specifically, we propose a two-level multi-instance learning (MIL) approach at patch and image level for case-level breast cancer prediction and evaluate it on two public and one private dataset. We propose a novel domain-specific MIL pooling observing that breast cancer may or may not occur in both sides, while images of both breasts are taken as a precaution during mammography. We propose a dynamic training procedure for training our MIL framework on a variable number of images per case. We show that our two-level MIL model can be applied in real hospital settings where only case labels, and a variable number of images per case are available, without any loss in performance compared to models trained on image labels. Only trained with weak (case-level) labels, it has the capability to point out in which breast side, mammography view and view region the abnormality lies.
KW - cs.CV
U2 - 10.48550/arXiv.2310.12677
DO - 10.48550/arXiv.2310.12677
M3 - Preprint
BT - Case-level Breast Cancer Prediction for Real Hospital Settings
PB - ArXiv.org
ER -