Post-Structuring Radiology Reports of Breast Cancer Patients for Clinical Quality Assurance

Shreyasi Pathak, Jorit van Rossen, Onno Vijlbrief, Jeroen Geerdink, Christin Seifert, Maurice van Keulen

    Research output: Contribution to journal › Article › Academic › peer-review

    Abstract

    Hospitals often set protocols based on well-defined standards to maintain the quality of patient reports. To ensure that clinicians conform to the protocols, quality assurance of these reports is needed. Patient reports are currently written in free-text format, which complicates the task of quality assurance. In this paper, we present a machine-learning-based natural language processing system for automatic quality assurance of radiology reports on breast cancer. This is achieved in three steps: we i) identify the top-level structure (headings) of the report, ii) classify the report content into the top-level headings, and iii) convert the free-text detailed findings in the report to a semi-structured format (post-structuring). The top-level structure and the content of the reports were predicted with F1 scores of 0.97 and 0.94, respectively, using Support Vector Machine (SVM) classifiers. For automatic structuring, our proposed hierarchical Conditional Random Field (CRF) outperformed the baseline CRF, with an F1 score of 0.78 vs. 0.71. The determined structure is represented as a semi-structured XML version of the free-text report, which makes it easy to visualize the conformance of the findings to the protocols. This format also allows easy extraction of specific information for other purposes such as search, evaluation and research.
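As a rough illustration of why a semi-structured XML representation helps with conformance checks and information extraction, the following minimal sketch builds such a document with Python's standard library. The element names (`report`, `section`, `mass`, `calcification`, `birads`), the sample sentences, and the helper functions are all hypothetical and chosen for illustration only; they are not the schema, data, or code used by the authors, and the SVM and CRF prediction steps are not modelled here.

```python
import xml.etree.ElementTree as ET


def build_report_xml(sections):
    """Build an XML tree from (heading, [(label, text), ...]) pairs.

    The pairs stand in for the output of the classification and
    structuring steps; all tag names are illustrative assumptions.
    """
    root = ET.Element("report")
    for heading, findings in sections:
        section = ET.SubElement(root, "section", name=heading)
        for label, text in findings:
            item = ET.SubElement(section, label)
            item.text = text
    return root


def extract(root, label):
    """Return the text of every element with the given tag."""
    return [element.text for element in root.iter(label)]


def missing_sections(root, required):
    """List required top-level headings that the report lacks."""
    present = {section.get("name") for section in root.iter("section")}
    return [name for name in required if name not in present]


# Invented example content, not taken from any real report.
sections = [
    ("clinical_data", [("text", "Screening follow-up.")]),
    ("findings", [
        ("mass", "oval mass in the left breast"),
        ("calcification", "no suspicious calcifications"),
    ]),
    ("conclusion", [("birads", "BI-RADS 2")]),
]

report = build_report_xml(sections)
xml_string = ET.tostring(report, encoding="unicode")
masses = extract(report, "mass")
gaps = missing_sections(report, ["clinical_data", "findings", "conclusion", "recommendation"])
```

Here `masses` collects every finding tagged as a mass, and `gaps` lists the headings a hypothetical protocol requires but the report does not contain, which is the kind of check the XML format is meant to make easy.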
    Original language: English
    Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
    DOI: 10.1109/TCBB.2019.2914678
    Publication status: E-pub ahead of print/First online - 3 May 2019

    Cite this

    @article{c01100c917b9455d9e8dd7c02724c2c5,
    title = "Post-Structuring Radiology Reports of Breast Cancer Patients for Clinical Quality Assurance",
    author = "Shreyasi Pathak and {van Rossen}, Jorit and Onno Vijlbrief and Jeroen Geerdink and Christin Seifert and {van Keulen}, Maurice",
    year = "2019",
    month = "5",
    day = "3",
    doi = "10.1109/TCBB.2019.2914678",
    language = "English",
    journal = "IEEE/ACM Transactions on Computational Biology and Bioinformatics",
    issn = "1545-5963",
    publisher = "IEEE",

    }

    Post-Structuring Radiology Reports of Breast Cancer Patients for Clinical Quality Assurance. / Pathak, Shreyasi; van Rossen, Jorit; Vijlbrief, Onno; Geerdink, Jeroen; Seifert, Christin; van Keulen, Maurice.

    In: IEEE/ACM Transactions on Computational Biology and Bioinformatics, 03.05.2019.

    TY - JOUR

    T1 - Post-Structuring Radiology Reports of Breast Cancer Patients for Clinical Quality Assurance

    AU - Pathak, Shreyasi

    AU - van Rossen, Jorit

    AU - Vijlbrief, Onno

    AU - Geerdink, Jeroen

    AU - Seifert, Christin

    AU - van Keulen, Maurice

    PY - 2019/5/3

    Y1 - 2019/5/3

    U2 - 10.1109/TCBB.2019.2914678

    DO - 10.1109/TCBB.2019.2914678

    M3 - Article

    JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics

    JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics

    SN - 1545-5963

    ER -