Automatic structuring of breast cancer radiology reports for quality assurance

Shreyasi Pathak, Jorit van Rossen, Onno Vijlbrief, Jeroen Geerdink, Christin Seifert, Maurice van Keulen

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review

    Abstract

    Hospitals often set protocols based on well-defined standards to maintain the quality of patient reports. To ensure that clinicians conform to these protocols, the reports need quality assurance. Patient reports are currently written as free text, which complicates this task. In this paper, we present a machine-learning-based natural language processing system for automatic quality assurance of radiology reports on breast cancer. The system works in three steps: we (i) identify the top-level structure of the report, (ii) check whether the information under each section corresponds to the section heading, and (iii) convert the free-text detailed findings in the report to a semi-structured format. Using a Support Vector Machine (SVM), the top-level structure and the content of the report were predicted with F1 scores of 0.97 and 0.94, respectively. For automatic structuring, our proposed hierarchical Conditional Random Field (CRF) outperformed the baseline CRF, with an F1 score of 0.78 versus 0.71. The third step generates a semi-structured XML representation of the free-text report, which makes it easy to visualize how well the findings conform to the protocols. This format also allows easy extraction of specific information for other purposes such as search, evaluation, and research.
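
    As a rough illustration of the third step, the sketch below turns already-labelled sentences into a semi-structured XML tree. It is only a minimal sketch under stated assumptions: the element names (`report`, `section`), the finding labels, the example sentences, and the helper `to_semi_structured_xml` are all hypothetical and are not taken from the paper, and the labels are hard-coded here rather than predicted by the SVM and CRF models the abstract describes.

```python
import xml.etree.ElementTree as ET

# Hypothetical labelled input: (section heading, finding label, sentence text).
# In the described system such labels would come from trained models;
# here they are hard-coded purely for illustration.
labelled_sentences = [
    ("Findings", "mass", "An oval mass is seen in the left breast."),
    ("Findings", "calcification", "No suspicious calcifications."),
    ("Conclusion", "birads", "BI-RADS 2."),
]


def to_semi_structured_xml(sentences):
    """Group labelled sentences by section and return an XML element tree."""
    report = ET.Element("report")
    sections = {}
    for section, label, text in sentences:
        # Create one <section> element per distinct heading, in input order.
        if section not in sections:
            sections[section] = ET.SubElement(report, "section", name=section)
        # Each finding becomes a child element named after its label.
        item = ET.SubElement(sections[section], label)
        item.text = text
    return report


report_xml = to_semi_structured_xml(labelled_sentences)
xml_string = ET.tostring(report_xml, encoding="unicode")
print(xml_string)
```

    Once the report is in this form, a specific finding can be pulled out with a path query such as `report_xml.find("section[@name='Findings']/mass")`, which is the kind of targeted extraction the abstract mentions for search and evaluation.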

    Original language: English
    Title of host publication: Proceedings of the Workshop on Data Mining in Biomedical Informatics and Healthcare (DMBIH 2018)
    Editors: Jeffrey Yu, Zhenhui Li, Hanghang Tong, Feida Zhu
    Publisher: IEEE Computer Society
    Pages: 732-739
    Number of pages: 8
    Volume: 2018-November
    ISBN (Electronic): 9781538692882
    Publication status: Published, 17 Nov 2018
    Event: 6th Workshop on Data Mining in Biomedical Informatics and Healthcare 2018, Singapore, Singapore
    Duration: 17 Nov 2018 to 17 Nov 2018
    Conference number: 6
    http://facweb.cs.depaul.edu/research/vc/ICDM18/index.html

    Workshop

    Workshop: 6th Workshop on Data Mining in Biomedical Informatics and Healthcare 2018
    Abbreviated title: DMBIH 2018
    Country: Singapore
    City: Singapore
    Period: 17/11/18 to 17/11/18

    Keywords

    • Quality Assurance
    • Automatic Structuring
    • Radiology
    • Natural Language Processing
    • Conditional Random Fields
