TY - BOOK
T1 - Taming Data Explosion in Probabilistic Information Integration
AU - de Keijzer, Ander
AU - van Keulen, Maurice
AU - Li, Yiping
N1 - Extended version of IIDB-paper.
PY - 2006/2
Y1 - 2006/2
N2 - Data integration has been a challenging problem for decades. In an ambient environment, where many autonomous devices have their own information sources and network connectivity is ad hoc and peer-to-peer, it becomes a serious bottleneck. To let devices exchange information without user interaction at data integration time and without extensive semantic annotations, a probabilistic approach seems promising: it teaches the device how to cope with the uncertainty that arises during data integration. Unfortunately, without any kind of world knowledge, almost everything becomes uncertain, so maintaining all possibilities produces huge integrated information sources. In this paper, we claim that very simple and generic rules provide enough world knowledge to drastically reduce the amount of uncertainty, and hence to tame the data explosion to a manageable size.
KW - DB-SDI: SCHEMA AND DATA INTEGRATION
M3 - Report
T3 - CTIT Technical Report Series
BT - Taming Data Explosion in Probabilistic Information Integration
PB - Centre for Telematics and Information Technology (CTIT)
CY - Enschede
ER -