Validation of multisource electronic health record data: An application to blood transfusion data

Loan R. van Hoeven, Martine C. De Bruijne, Peter F. Kemper, Maria M.W. Koopman, Jan M.M. Rondeel, Anja Leyte, Hendrik Koffijberg, Mart P. Janssen, Kit C.B. Roes

Research output: Contribution to journal; Article; Academic; peer-review

Abstract

Background: Although data from electronic health records (EHR) are often used for research purposes, systematic validation of these data prior to their use is not standard practice. Existing validation frameworks discuss validity concepts without translating these into practical implementation steps or addressing the potential influence of linking multiple sources. Therefore, we developed a practical approach for validating routinely collected data from multiple sources and applied it to a blood transfusion data warehouse to evaluate its usability in practice.

Methods: The approach consists of identifying existing validation frameworks for EHR data or linked data, selecting validity concepts from these frameworks and establishing quantifiable validity outcomes for each concept. The approach distinguishes external validation concepts (e.g. concordance with external reports, previous literature and expert feedback) and internal consistency concepts which use expected associations within the dataset itself (e.g. completeness, uniformity and plausibility). In an example case, the selected concepts were applied to a transfusion dataset and specified in more detail.
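As an illustration of what a quantifiable validity outcome for an internal consistency concept might look like, the sketch below computes a completeness fraction and flags implausible values. It is a minimal example and not the authors' implementation: the field name `hb_g_dl`, the toy records, and the plausibility bounds are all hypothetical assumptions chosen for demonstration.

```python
def completeness(records, field):
    """Fraction of records in which `field` is present and not None."""
    if not records:
        raise ValueError("no records to evaluate")
    filled = sum(1 for r in records if r.get(field) is not None)
    return filled / len(records)


def implausible(records, field, low, high):
    """Records whose non-missing `field` value lies outside [low, high]."""
    return [
        r for r in records
        if r.get(field) is not None and not (low <= r[field] <= high)
    ]


# Hypothetical toy records; the values are invented, not taken from the study.
records = [
    {"patient_id": 1, "hb_g_dl": 7.9},
    {"patient_id": 2, "hb_g_dl": None},   # missing measurement
    {"patient_id": 3, "hb_g_dl": 79.0},   # possibly a unit or decimal error
    {"patient_id": 4, "hb_g_dl": 9.4},
]

# Completeness: share of records with a recorded value.
hb_complete = completeness(records, "hb_g_dl")

# Plausibility: the bounds 2.0 to 25.0 are illustrative assumptions only.
flagged = implausible(records, "hb_g_dl", 2.0, 25.0)

print(f"completeness: {hb_complete:.0%}")
print("implausible patient_ids:", [r["patient_id"] for r in flagged])
```

Expressing each concept as a single number or a list of flagged records makes it possible to track the outcome before and after a change to data processing or extraction.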

Results: Application of the approach to a transfusion dataset resulted in a structured overview of data validity aspects. This allowed improvement of these aspects through further processing of the data and in some cases adjustment of the data extraction. For example, the proportion of transfused products that could not be linked to the corresponding issued products was initially 2.2% but was reduced to 0.17% by adjusting the data extraction criteria.
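The linkage outcome reported here can be computed as a simple proportion. The following sketch assumes that transfused and issued products share a product identifier; the function name, the identifiers, and the toy data are hypothetical and do not reproduce the study's data or its actual linkage procedure.

```python
def unlinked_proportion(transfused_ids, issued_ids):
    """Fraction of transfused product IDs with no matching issued product."""
    issued = set(issued_ids)
    transfused = list(transfused_ids)
    if not transfused:
        raise ValueError("no transfused products to evaluate")
    unlinked = [pid for pid in transfused if pid not in issued]
    return len(unlinked) / len(transfused)


# Hypothetical toy identifiers, invented for illustration.
issued = ["P001", "P002", "P003", "P004"]
transfused = ["P001", "P002", "P005", "P004"]  # P005 has no issued record

rate = unlinked_proportion(transfused, issued)
print(f"{rate:.1%} of transfused products could not be linked")
```

Recomputing this proportion after each adjustment of the extraction criteria shows whether the adjustment improved linkage.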

Conclusions: This stepwise approach for validating linked multisource data provides a basis for evaluating data quality and enhancing interpretation. When the process of data validation is adopted more broadly, this contributes to increased transparency and greater reliability of research based on routinely collected electronic health records.

Original language: English
Article number: 107
Journal: BMC medical informatics and decision making
Volume: 17
Issue number: 1
DOI: https://doi.org/10.1186/s12911-017-0504-7
Publication status: Published - 14 Jul 2017

Fingerprint

Electronic Health Records
Blood Transfusion
Information Storage and Retrieval
Research
Research Design
Datasets

Keywords

  • Data quality
  • Data validation
  • Linkage of multiple sources
  • Routinely collected data

Cite this

van Hoeven, L. R., De Bruijne, M. C., Kemper, P. F., Koopman, M. M. W., Rondeel, J. M. M., Leyte, A., ... Roes, K. C. B. (2017). Validation of multisource electronic health record data: An application to blood transfusion data. BMC medical informatics and decision making, 17(1), [107]. https://doi.org/10.1186/s12911-017-0504-7
van Hoeven, Loan R. ; De Bruijne, Martine C. ; Kemper, Peter F. ; Koopman, Maria M.W. ; Rondeel, Jan M.M. ; Leyte, Anja ; Koffijberg, Hendrik ; Janssen, Mart P. ; Roes, Kit C.B. / Validation of multisource electronic health record data : An application to blood transfusion data. In: BMC medical informatics and decision making. 2017 ; Vol. 17, No. 1.
@article{9ef1ff64a78145bfa60c1aa0d679b7c8,
title = "Validation of multisource electronic health record data: An application to blood transfusion data",
abstract = "Background: Although data from electronic health records (EHR) are often used for research purposes, systematic validation of these data prior to their use is not standard practice. Existing validation frameworks discuss validity concepts without translating these into practical implementation steps or addressing the potential influence of linking multiple sources. Therefore, we developed a practical approach for validating routinely collected data from multiple sources and applied it to a blood transfusion data warehouse to evaluate its usability in practice. Methods: The approach consists of identifying existing validation frameworks for EHR data or linked data, selecting validity concepts from these frameworks and establishing quantifiable validity outcomes for each concept. The approach distinguishes external validation concepts (e.g. concordance with external reports, previous literature and expert feedback) and internal consistency concepts which use expected associations within the dataset itself (e.g. completeness, uniformity and plausibility). In an example case, the selected concepts were applied to a transfusion dataset and specified in more detail. Results: Application of the approach to a transfusion dataset resulted in a structured overview of data validity aspects. This allowed improvement of these aspects through further processing of the data and in some cases adjustment of the data extraction. For example, the proportion of transfused products that could not be linked to the corresponding issued products was initially 2.2{\%} but was reduced to 0.17{\%} by adjusting the data extraction criteria. Conclusions: This stepwise approach for validating linked multisource data provides a basis for evaluating data quality and enhancing interpretation. When the process of data validation is adopted more broadly, this contributes to increased transparency and greater reliability of research based on routinely collected electronic health records.",
keywords = "Data quality, Data validation, Linkage of multiple sources, Routinely collected data",
author = "{van Hoeven}, {Loan R.} and {De Bruijne}, {Martine C.} and Kemper, {Peter F.} and Koopman, {Maria M.W.} and Rondeel, {Jan M.M.} and Anja Leyte and Hendrik Koffijberg and Janssen, {Mart P.} and Roes, {Kit C.B.}",
year = "2017",
month = "7",
day = "14",
doi = "10.1186/s12911-017-0504-7",
language = "English",
volume = "17",
journal = "BMC medical informatics and decision making",
issn = "1472-6947",
publisher = "BioMed Central Ltd.",
number = "1",

}

van Hoeven, LR, De Bruijne, MC, Kemper, PF, Koopman, MMW, Rondeel, JMM, Leyte, A, Koffijberg, H, Janssen, MP & Roes, KCB 2017, 'Validation of multisource electronic health record data: An application to blood transfusion data', BMC medical informatics and decision making, vol. 17, no. 1, 107. https://doi.org/10.1186/s12911-017-0504-7

Validation of multisource electronic health record data : An application to blood transfusion data. / van Hoeven, Loan R.; De Bruijne, Martine C.; Kemper, Peter F.; Koopman, Maria M.W.; Rondeel, Jan M.M.; Leyte, Anja; Koffijberg, Hendrik; Janssen, Mart P.; Roes, Kit C.B.

In: BMC medical informatics and decision making, Vol. 17, No. 1, 107, 14.07.2017.

TY - JOUR

T1 - Validation of multisource electronic health record data

T2 - An application to blood transfusion data

AU - van Hoeven, Loan R.

AU - De Bruijne, Martine C.

AU - Kemper, Peter F.

AU - Koopman, Maria M.W.

AU - Rondeel, Jan M.M.

AU - Leyte, Anja

AU - Koffijberg, Hendrik

AU - Janssen, Mart P.

AU - Roes, Kit C.B.

PY - 2017/7/14

Y1 - 2017/7/14

AB - Background: Although data from electronic health records (EHR) are often used for research purposes, systematic validation of these data prior to their use is not standard practice. Existing validation frameworks discuss validity concepts without translating these into practical implementation steps or addressing the potential influence of linking multiple sources. Therefore, we developed a practical approach for validating routinely collected data from multiple sources and applied it to a blood transfusion data warehouse to evaluate its usability in practice. Methods: The approach consists of identifying existing validation frameworks for EHR data or linked data, selecting validity concepts from these frameworks and establishing quantifiable validity outcomes for each concept. The approach distinguishes external validation concepts (e.g. concordance with external reports, previous literature and expert feedback) and internal consistency concepts which use expected associations within the dataset itself (e.g. completeness, uniformity and plausibility). In an example case, the selected concepts were applied to a transfusion dataset and specified in more detail. Results: Application of the approach to a transfusion dataset resulted in a structured overview of data validity aspects. This allowed improvement of these aspects through further processing of the data and in some cases adjustment of the data extraction. For example, the proportion of transfused products that could not be linked to the corresponding issued products was initially 2.2% but was reduced to 0.17% by adjusting the data extraction criteria. Conclusions: This stepwise approach for validating linked multisource data provides a basis for evaluating data quality and enhancing interpretation. When the process of data validation is adopted more broadly, this contributes to increased transparency and greater reliability of research based on routinely collected electronic health records.

KW - Data quality

KW - Data validation

KW - Linkage of multiple sources

KW - Routinely collected data

UR - http://www.scopus.com/inward/record.url?scp=85023766359&partnerID=8YFLogxK

U2 - 10.1186/s12911-017-0504-7

DO - 10.1186/s12911-017-0504-7

M3 - Article

AN - SCOPUS:85023766359

VL - 17

JO - BMC medical informatics and decision making

JF - BMC medical informatics and decision making

SN - 1472-6947

IS - 1

M1 - 107

ER -