IMPrECISE is a probabilistic XML database system that supports near-automatic integration of XML documents. The user only needs to configure the system with a few simple knowledge rules, which allow it to eliminate enough nonsense possibilities. We demonstrate the integration process under conditions with varying degrees of confusion and with different sets of rules.
Even when an integrated document still contains much uncertainty, it can be queried effectively. The system produces a sequence of possible result elements ranked by likelihood. User feedback on query results further reduces uncertainty, which in a sense continues the semantic integration process incrementally. We demonstrate querying on integrated documents and measure answer quality with adapted precision and recall measures. The user feedback mechanism has not been implemented yet and hence cannot be demonstrated.
IMPrECISE has been implemented as an XQuery module for the XML DBMS MonetDB/XQuery. Therefore, the demo also illustrates the power of this XML DBMS and of XQuery as both a query and programming language.
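The ranking of possible result elements by likelihood, and a probability-weighted form of precision and recall, can be illustrated with a minimal Python sketch. It is not the IMPrECISE implementation (which is an XQuery module) and assumes a simplified model: an uncertain document is a list of possible worlds, each with a probability and the set of answers a query returns in that world. The function names, the example data, and the exact form of the weighted measures are assumptions made for illustration and may differ from the adapted measures used in the demo.

```python
from collections import defaultdict


def rank_answers(worlds):
    """Rank answers by the total probability of the worlds that contain them.

    worlds: list of (probability, set_of_answers) pairs; the probabilities
    are assumed to sum to 1.
    """
    scores = defaultdict(float)
    for prob, answers in worlds:
        for answer in answers:
            scores[answer] += prob
    # Highest probability first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def weighted_precision_recall(ranked, correct):
    """One plausible probability-weighted variant of precision and recall.

    Each answer counts in proportion to its probability rather than as 0 or 1.
    """
    total = sum(p for _, p in ranked)
    hit = sum(p for answer, p in ranked if answer in correct)
    precision = hit / total if total else 0.0
    recall = hit / len(correct) if correct else 0.0
    return precision, recall


# Hypothetical example: three possible worlds of an integrated movie document.
worlds = [
    (0.5, {"Die Hard", "Die Hard 2"}),
    (0.25, {"Die Hard"}),
    (0.25, {"Die Hard", "Die Hard II"}),
]
ranked = rank_answers(worlds)
precision, recall = weighted_precision_recall(ranked, {"Die Hard", "Die Hard 2"})

for answer, prob in ranked:
    print(f"{prob:.2f}  {answer}")
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this sketch an answer that appears in every possible world gets probability 1 and ranks first, while an answer that stems from a single unlikely world sinks to the bottom and lowers the weighted precision only slightly.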
|Title of host publication||08421 Abstracts Collection - Uncertainty Management in Information Systems|
|Editors||C. Koch, B. König-Ries, V. Markl, Maurice van Keulen|
|Place of Publication||Dagstuhl, Germany|
|Number of pages||1|
|Publication status||Published - Mar 2009|
|Event||Uncertainty Management in Information Systems: Dagstuhl Seminar 08421 - Dagstuhl, Germany|
Duration: 12 Oct 2008 → 17 Oct 2008
|Name||Dagstuhl Seminar Proceedings|
|Publisher||Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik|
|Workshop||Uncertainty Management in Information Systems|
|Period||12/10/08 → 17/10/08|
|Other||12 - 17 Oct 2008|
- Data integration
- Probabilistic databases
- Data quality
- Entity resolution
- Uncertainty management