Rule-based conditioning of probabilistic data

Maurice van Keulen*, Benjamin Kaminski, Christoph Matheja, Joost P. Katoen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Data interoperability is a major issue in data management for data science and big data analytics. Probabilistic data integration (PDI) is a specific kind of data integration where extraction and integration problems such as inconsistency and uncertainty are handled by means of a probabilistic data representation. This allows a data integration process with two phases: (1) a quick partial integration where data quality problems are represented as uncertainty in the resulting integrated data, and (2) using the uncertain data and continuously improving its quality as more evidence is gathered. The main contribution of this paper is an iterative approach for incorporating evidence of users in the probabilistically integrated data. Evidence can be specified as hard or soft rules (i.e., rules that are uncertain themselves).
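As an illustration of what incorporating a rule into uncertain data can mean, the sketch below conditions a small set of possible worlds on a rule. It is a minimal example under stated assumptions, not the method of the paper: the function name `reweight`, the `city` attribute, and the probabilities are all hypothetical. A hard rule discards the worlds that violate it and renormalizes the rest. For a soft rule, the sketch assumes one simple interpretation, namely that violating worlds are scaled by `1 - confidence` before renormalizing; the paper may define soft rules differently.

```python
from fractions import Fraction


def reweight(worlds, rule, confidence=Fraction(1)):
    """Condition a list of (world, probability) pairs on a rule.

    Assumption (not taken from the paper): a world that violates the rule
    keeps only a fraction (1 - confidence) of its probability mass.
    With confidence == 1 this is ordinary hard conditioning.
    """
    weighted = [
        (world, p if rule(world) else p * (1 - confidence))
        for world, p in worlds
    ]
    total = sum(p for _, p in weighted)
    if total == 0:
        raise ValueError("the rule contradicts every possible world")
    # Drop worlds whose mass became zero, then renormalize the rest.
    return [(world, p / total) for world, p in weighted if p > 0]


# Hypothetical integration result: three sources disagree on a city.
worlds = [
    ({"city": "Milan"}, Fraction(1, 2)),
    ({"city": "Rome"}, Fraction(3, 10)),
    ({"city": "Turin"}, Fraction(1, 5)),
]

# Hard rule: the city is certainly not Turin.
hard = reweight(worlds, lambda w: w["city"] != "Turin")

# Soft rule: the city is Milan, believed with confidence 3/4.
soft = reweight(worlds, lambda w: w["city"] == "Milan", Fraction(3, 4))
```

Under these assumptions the hard rule leaves Milan at 5/8 and Rome at 3/8, while the soft rule shifts the distribution to 4/5, 3/25, and 2/25 without eliminating any world. Exact fractions are used so that the renormalized probabilities sum to 1 without rounding error.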

Original language: English
Title of host publication: Scalable Uncertainty Management
Subtitle of host publication: 12th International Conference, SUM 2018, Milan, Italy, October 3-5, 2018, Proceedings
Editors: Davide Ciucci, Gabriella Pasi, Barbara Vantaggi
Place of Publication: Cham
Publisher: Springer
Pages: 290-305
Number of pages: 16
ISBN (Electronic): 978-3-030-00461-3
ISBN (Print): 978-3-030-00460-6
Publication status: Published - 1 Jan 2018
Event: 12th International Conference on Scalable Uncertainty Management 2018 - Milan, Italy
Duration: 3 Oct 2018 to 5 Oct 2018
Conference number: 12
http://www.ir.disco.unimib.it/sum2018/

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 11142
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 12th International Conference on Scalable Uncertainty Management 2018
Abbreviated title: SUM 2018
Country/Territory: Italy
City: Milan
Period: 3/10/18 to 5/10/18

Keywords

  • Data cleaning
  • Data integration
  • Information extraction
  • Probabilistic databases
  • Probabilistic programming
