Radiology report generation for proximal femur fractures using deep classification and language generation models

Olivier Paalvast*, Meike Nauta, Marion Koelle, Jeroen Geerdink, Onno Vijlbrief, Johannes H. Hegeman, Christin Seifert

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


Proximal femur fractures represent a major health concern and contribute substantially to morbidity in the elderly. Correct classification and diagnosis of hip fractures have a significant impact on mortality, costs, and length of hospital stay. In this paper, we present a method and empirical validation for automatic subclassification of proximal femur fractures and Dutch radiological report generation that does not rely on manually curated data. The fracture classification model was trained on 11,000 X-ray images obtained from 5,000 electronic health records in a general hospital. To generate the Dutch reports, we first trained an embedding model on 20,000 radiological reports of pelvic region fractures and used its embeddings in the report generation model. We trained the report generation model on the 5,000 radiological reports associated with the fracture cases. Our report generation model is on par with the state of the art in terms of BLEU and ROUGE scores. This is promising because, in contrast to those earlier works, our approach does not require manual preprocessing of either the images or the reports, which boosts the applicability of automatic clinical report generation in practice. A quantitative and qualitative user study among medical students found no significant difference in provenance of real and generated reports. A qualitative, in-depth clinical relevance study with medical domain experts showed that, from a human perspective, the quality of the generated reports approximates that of the original reports, and highlighted challenges in creating sufficiently detailed and versatile training data for automatic radiology report generation.

Original language: English
Article number: 102281
Number of pages: 14
Journal: Artificial intelligence in medicine
Early online date: 26 Mar 2022
Publication status: Published - Jun 2022


  • Fracture classification
  • Proximal femur fractures
  • Radiology language model
  • Radiology report generation
  • User study

