Managing Uncertainty: The Road Towards Better Data Interoperability

M. Herschel (Editor), Maurice van Keulen

Research output: Contribution to journal, Article, Academic, peer-review

Abstract

Data interoperability encompasses the many data management activities needed for effective information management in anyone's or any organization's everyday work, such as data cleaning, coupling, fusion, mapping, and information extraction. It is our conviction that a significant amount of the money and time in IT devoted to these activities is about dealing with one problem: “semantic uncertainty”. Sometimes data is subjective, incomplete, not current, or incorrect; sometimes it can be interpreted in different ways. In our opinion, clean, correct data is only a special case; hence, data management technology should treat data quality problems as a fact of life, not as something to be repaired afterwards. Recent approaches treat uncertainty as an additional source of information which should be preserved to reduce its impact. We believe that the road towards better data interoperability is to be found in teaching our data processing tools and systems about all forms of doubt and how to live with them. In this paper, we show for several data interoperability use cases (deduplication, data coupling/fusion, and information extraction) how to formally model the associated data quality problems as semantic uncertainty. Furthermore, we argue why our approach leads to better data interoperability in terms of natural problem exposure and risk assessment, more robustness and automation, reduced development costs, and potential for natural and effective feedback loops leveraging human attention.
Original language: Undefined
Pages (from-to): 138-146
Number of pages: 9
Journal: IT - Information Technology
Volume: 54
Issue number: 3
DOI: 10.1524/itit.2012.0674
Publication status: Published - May 2012

Keywords

  • IR-80560
  • EWI-21947
  • METIS-287885

Cite this

@article{b03d2ecd316e4858bcd0e37123d129f5,
title = "Managing Uncertainty: The Road Towards Better Data Interoperability",
keywords = "IR-80560, EWI-21947, METIS-287885",
author = "{van Keulen}, Maurice",
editor = "M. Herschel",
note = "eemcs-eprint-21947",
year = "2012",
month = "5",
doi = "10.1524/itit.2012.0674",
language = "Undefined",
volume = "54",
pages = "138--146",
journal = "IT - Information Technology",
issn = "1611-2776",
publisher = "de Gruyter",
number = "3",

}

Managing Uncertainty: The Road Towards Better Data Interoperability. / Herschel, M. (Editor); van Keulen, Maurice.

In: IT - Information Technology, Vol. 54, No. 3, 05.2012, p. 138-146.

TY - JOUR

T1 - Managing Uncertainty: The Road Towards Better Data Interoperability

AU - van Keulen, Maurice

A2 - Herschel, M.

N1 - eemcs-eprint-21947

PY - 2012/5

Y1 - 2012/5

KW - IR-80560

KW - EWI-21947

KW - METIS-287885

U2 - 10.1524/itit.2012.0674

DO - 10.1524/itit.2012.0674

M3 - Article

VL - 54

SP - 138

EP - 146

JO - IT - Information Technology

JF - IT - Information Technology

SN - 1611-2776

IS - 3

ER -