Taming Data Explosion in Probabilistic Information Integration

Ander de Keijzer, Maurice van Keulen, Yiping Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Data integration has been a challenging problem for decades. In autonomous data integration, i.e., without a user to solve semantic uncertainty and conflicts between data sources, it even becomes a serious bottleneck. A probabilistic approach seems promising as it does not require extensive semantic annotations nor user interaction at integration time. It simply teaches the application how to generically cope with uncertainty. Unfortunately, without any world knowledge, uncertainty abounds as almost everything becomes (theoretically) possible and maintaining all possibilities produces huge volumes of data. In this paper, we claim that simple and generic knowledge rules are sufficient to drastically reduce uncertainty, hence tame data explosion to a manageable size.
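
The claim in the abstract can be illustrated with a small counting sketch. It is not the authors' system or algorithm: the records, the function `possible_worlds`, and the rule `same_year` are hypothetical names made up here. The sketch assumes that each "possible world" is a partial one-to-one matching between the records of two sources, and that a knowledge rule is a predicate that forbids certain pairs from referring to the same real-world object.

```python
from itertools import combinations, permutations

def possible_worlds(a, b, may_match=lambda x, y: True):
    """Enumerate all partial one-to-one matchings between records of a and b.

    A matching is kept only if every matched pair satisfies may_match.
    With the default predicate, nothing is ruled out (no world knowledge).
    """
    worlds = []
    for k in range(min(len(a), len(b)) + 1):
        for left in combinations(range(len(a)), k):
            for right in permutations(range(len(b)), k):
                pairs = list(zip(left, right))
                if all(may_match(a[i], b[j]) for i, j in pairs):
                    worlds.append(pairs)
    return worlds

# Illustrative records (title, year); both sources are invented for this sketch.
source_a = [("Jaws", 1975), ("Alien", 1979), ("Heat", 1995)]
source_b = [("Jaws", 1975), ("Aliens", 1986), ("Heat", 1995)]

def same_year(x, y):
    """Hypothetical generic rule: records with different years are different objects."""
    return x[1] == y[1]

print(len(possible_worlds(source_a, source_b)))             # 34
print(len(possible_worlds(source_a, source_b, same_year)))  # 4
```

Even with three records per source, the unconstrained case yields 34 possible worlds, while the single rule leaves 4. The gap widens quickly as the sources grow, which is the kind of reduction the abstract argues for.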
Original language: Undefined
Title of host publication: Pre-Proceedings of the International Workshop on Inconsistency and Incompleteness in Databases (IIDB 2006)
Publisher: University of Mons-Hainaut, Belgium
Pages: 82-86
Number of pages: 5
ISBN (Print): not assigned
Publication status: Published - 26 Mar 2006

Keywords

  • IR-66509
  • METIS-238693
  • DB-SDI: SCHEMA AND DATA INTEGRATION
  • EWI-7537

Cite this

de Keijzer, A., van Keulen, M., & Li, Y. (2006). Taming Data Explosion in Probabilistic Information Integration. In Pre-Proceedings of the International Workshop on Inconsistency and Incompleteness in Databases (IIDB 2006) (pp. 82-86). University of Mons-Hainaut, Belgium.
@inproceedings{e581f641883c43d298813754e43e4ef8,
title = "Taming Data Explosion in Probabilistic Information Integration",
abstract = "Data integration has been a challenging problem for decades. In autonomous data integration, i.e., without a user to solve semantic uncertainty and conflicts between data sources, it even becomes a serious bottleneck. A probabilistic approach seems promising as it does not require extensive semantic annotations nor user interaction at integration time. It simply teaches the application how to generically cope with uncertainty. Unfortunately, without any world knowledge, uncertainty abounds as almost everything becomes (theoretically) possible and maintaining all possibilities produces huge volumes of data. In this paper, we claim that simple and generic knowledge rules are sufficient to drastically reduce uncertainty, hence tame data explosion to a manageable size.",
keywords = "IR-66509, METIS-238693, DB-SDI: SCHEMA AND DATA INTEGRATION, EWI-7537",
author = "{de Keijzer}, Ander and {van Keulen}, Maurice and Yiping Li",
note = "Position paper. Pre-proceedings can be obtained from workshop website (http://ssi.umh.ac.be/iidb) or Jef Wijsen, Institut d'Informatique, Universit{\'e} de Mons-Hainaut, B-7000 Mons, Belgium.",
year = "2006",
month = "3",
day = "26",
language = "Undefined",
isbn = "not assigned",
publisher = "University of Mons-Hainaut, Belgium",
pages = "82--86",
booktitle = "Pre-Proceedings of the International Workshop on Inconsistency and Incompleteness in Databases (IIDB 2006)",

}

TY - GEN

T1 - Taming Data Explosion in Probabilistic Information Integration

AU - de Keijzer, Ander

AU - van Keulen, Maurice

AU - Li, Yiping

N1 - Position paper. Pre-proceedings can be obtained from workshop website (http://ssi.umh.ac.be/iidb) or Jef Wijsen, Institut d'Informatique, Université de Mons-Hainaut, B-7000 Mons, Belgium.

PY - 2006/3/26

Y1 - 2006/3/26

AB - Data integration has been a challenging problem for decades. In autonomous data integration, i.e., without a user to solve semantic uncertainty and conflicts between data sources, it even becomes a serious bottleneck. A probabilistic approach seems promising as it does not require extensive semantic annotations nor user interaction at integration time. It simply teaches the application how to generically cope with uncertainty. Unfortunately, without any world knowledge, uncertainty abounds as almost everything becomes (theoretically) possible and maintaining all possibilities produces huge volumes of data. In this paper, we claim that simple and generic knowledge rules are sufficient to drastically reduce uncertainty, hence tame data explosion to a manageable size.

KW - IR-66509

KW - METIS-238693

KW - DB-SDI: SCHEMA AND DATA INTEGRATION

KW - EWI-7537

M3 - Conference contribution

SN - not assigned

SP - 82

EP - 86

BT - Pre-Proceedings of the International Workshop on Inconsistency and Incompleteness in Databases (IIDB 2006)

PB - University of Mons-Hainaut, Belgium

ER -
