TY - JOUR
T1 - Radiology report generation for proximal femur fractures using deep classification and language generation models
AU - Paalvast, Olivier
AU - Nauta, Meike
AU - Koelle, Marion
AU - Geerdink, Jeroen
AU - Vijlbrief, Onno
AU - Hegeman, Johannes H.
AU - Seifert, Christin
N1 - Publisher Copyright: © 2022 The Authors
PY - 2022/6
Y1 - 2022/6
AB - Proximal femur fractures represent a major health concern and contribute substantially to morbidity in the elderly. Correct classification and diagnosis of hip fractures have a significant impact on mortality, costs and hospital stay. In this paper, we present a method and empirical validation for automatic subclassification of proximal femur fractures and Dutch radiological report generation that does not rely on manually curated data. The fracture classification model was trained on 11,000 X-ray images obtained from 5000 electronic health records in a general hospital. To generate the Dutch reports, we first trained an embedding model on 20,000 radiological reports of pelvic region fractures and used its embeddings in the report generation model. We trained the report generation model on the 5000 radiological reports associated with the fracture cases. Our report generation model is on par with the state of the art in terms of BLEU and ROUGE scores. This is promising because, in contrast to earlier work, our approach does not require manual preprocessing of either the images or the reports. This boosts the applicability of automatic clinical report generation in practice. A quantitative and qualitative user study among medical students found no significant difference in the perceived provenance of real and generated reports. A qualitative, in-depth clinical relevance study with medical domain experts showed that, from a human perspective, the quality of the generated reports approximates that of the original reports, and it highlighted challenges in creating sufficiently detailed and versatile training data for automatic radiology report generation.
KW - Fracture classification
KW - Proximal femur fractures
KW - Radiology language model
KW - Radiology report generation
KW - User study
UR - http://www.scopus.com/inward/record.url?scp=85128847465&partnerID=8YFLogxK
U2 - 10.1016/j.artmed.2022.102281
DO - 10.1016/j.artmed.2022.102281
M3 - Article
AN - SCOPUS:85128847465
SN - 0933-3657
VL - 128
JO - Artificial Intelligence in Medicine
JF - Artificial Intelligence in Medicine
M1 - 102281
ER -