Accurate and Reliable Classification of Unstructured Reports on Their Diagnostic Goal Using BERT Models

Max Tigo Rietberg, Van Bach Nguyen, Jeroen Geerdink, Onno Vijlbrief, Christin Seifert*

*Corresponding author for this work

Research output: Contribution to journalArticleAcademicpeer-review

4 Citations (Scopus)
34 Downloads (Pure)

Abstract

Understanding the diagnostic goal of medical reports is valuable information for understanding patient flows. This work focuses on extracting the reason for taking an MRI scan of Multiple Sclerosis (MS) patients using the attached free-form reports: Diagnosis, Progression or Monitoring. We investigate the performance of domain-dependent and general state-of-the-art language models and their alignment with domain expertise. To this end, eXplainable Artificial Intelligence (XAI) techniques are used to acquire insight into the inner workings of the model, which are verified on their trustworthiness. The verified XAI explanations are then compared with explanations from a domain expert, to indirectly determine the reliability of the model. BERTje, a Dutch Bidirectional Encoder Representations from Transformers (BERT) model, outperforms RobBERT and MedRoBERTa.nl in both accuracy and reliability. The latter model (MedRoBERTa.nl) is a domain-specific model, while BERTje is a generic model, showing that domain-specific models are not always superior. Our validation of BERTje in a small prospective study shows promising results for the potential uptake of the model in a practical setting.

Original languageEnglish
Article number1251
JournalDiagnostics
Volume13
Issue number7
Early online date27 Mar 2023
DOIs
Publication statusPublished - 1 Apr 2023

Keywords

  • BERT
  • Health informatics
  • Natural Language Processing (NLP)
  • Text classification

Fingerprint

Dive into the research topics of 'Accurate and Reliable Classification of Unstructured Reports on Their Diagnostic Goal Using BERT Models'. Together they form a unique fingerprint.

Cite this