Compression of Probabilistic XML documents

Irma Veldman, Ander de Keijzer, Maurice van Keulen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review


Abstract

Database techniques to store, query and manipulate data that contains uncertainty are receiving increasing research interest. Such uncertain DBMSs (UDBMSs) can be classified according to their underlying data model: relational, XML, or RDF. We focus on uncertain XML DBMSs, taking the Probabilistic XML model (PXML) of [9] as a representative example. The size of a PXML document is obviously a factor in performance. There are PXML-specific techniques to reduce the size, such as a push-down mechanism that produces equivalent but more compact PXML documents. It can only be applied, however, where possibilities are dependent. For normal XML documents, several compression techniques also exist. Since Probabilistic XML is (a special form of) normal XML, it might benefit from these methods even more. In this paper, we show that existing compression mechanisms can be combined with PXML-specific compression techniques. We also show that the best compression rates are obtained by combining a PXML-specific technique with a rather simple generic DAG-compression technique.
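As a rough illustration of the generic DAG-compression idea mentioned in the abstract, the sketch below stores each distinct subtree of an XML document only once, so that identical subtrees are shared. It is not the authors' implementation. The element names `prob` and `poss`, the attribute `p`, and the helper names `dag_compress` and `count_elements` are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# Hypothetical PXML-like document: <prob>/<poss p="..."> are illustrative
# names for probability and possibility nodes, not the paper's syntax.
DOC = """
<persons>
  <prob>
    <poss p="0.5"><person><name>John</name><tel>1111</tel></person></poss>
    <poss p="0.5"><person><name>John</name><tel>2222</tel></person></poss>
  </prob>
</persons>
"""

def dag_compress(root):
    """Return (root_id, nodes), where nodes holds one entry per distinct subtree."""
    table = {}  # subtree signature -> node id
    nodes = []  # node id -> (tag, attributes, text, child ids)

    def visit(elem):
        # Children first, so a subtree's signature is built from child ids.
        child_ids = tuple(visit(child) for child in elem)
        key = (
            elem.tag,
            tuple(sorted(elem.attrib.items())),
            (elem.text or "").strip(),
            child_ids,
        )
        if key not in table:
            table[key] = len(nodes)
            nodes.append(key)
        return table[key]

    return visit(root), nodes

def count_elements(root):
    """Number of element nodes in the uncompressed tree."""
    return sum(1 for _ in root.iter())

root = ET.fromstring(DOC)
root_id, nodes = dag_compress(root)
print(count_elements(root), "tree nodes ->", len(nodes), "DAG nodes")
```

In this small document only the repeated `<name>John</name>` subtree is shared, so the saving is a single node. Documents with many repeated subtrees across possibilities would share correspondingly more.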
Original language: English
Title of host publication: Scalable Uncertainty Management
Subtitle of host publication: Third International Conference, SUM 2009, Washington, DC, USA, September 28-30, 2009, Proceedings
Editors: Lluís Godo, Andrea Pugliese
Place of Publication: Berlin
Publisher: Springer
Pages: 255-267
Number of pages: 14
ISBN (Electronic): 978-3-642-04388-8
ISBN (Print): 978-3-642-04387-1
Publication status: Published - Sept 2009
Event: 3rd International Conference on Scalable Uncertainty Management 2009 - Samuel Riggs Alumni Center, Orem Hall C, University of Maryland, College Park, United States
Duration: 28 Sept 2009 to 30 Sept 2009
Conference number: 3
http://wwwinfo.deis.unical.it/apugliese/SUM2009/

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer Verlag
Volume: 5785
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 3rd International Conference on Scalable Uncertainty Management 2009
Abbreviated title: SUM 2009
Country/Territory: United States
City: College Park
Period: 28/09/09 to 30/09/09

Keywords

  • DB-SDI: SCHEMA AND DATA INTEGRATION
