Compression of Probabilistic XML documents

Irma Veldman, Ander de Keijzer, Maurice van Keulen

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review


Abstract

Database techniques to store, query and manipulate data that contains uncertainty receive increasing research interest. Such uncertain DBMSs (UDBMSs) can be classified according to their underlying data model: relational, XML, or RDF. We focus on uncertain XML DBMSs, taking the Probabilistic XML model (PXML) of [9] as a representative example. The size of a PXML document is obviously a factor in performance. There are PXML-specific techniques to reduce the size, such as a push-down mechanism that produces equivalent but more compact PXML documents. It can only be applied, however, where possibilities are dependent. For normal XML documents there also exist several techniques for compressing a document. Since Probabilistic XML is (a special form of) normal XML, it might benefit from these methods even more. In this paper, we show that existing compression mechanisms can be combined with PXML-specific compression techniques. We also show that the best compression rates are obtained by combining a PXML-specific technique with a rather simple generic DAG-compression technique.
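
To illustrate what generic DAG compression means for an XML tree, the following is a minimal sketch, not the authors' implementation. It assumes that two subtrees are identical when their tag, attributes, stripped text and child sequence all match, and it stores each distinct subtree only once. The element names `prob` and `poss` and the attribute `p` are hypothetical stand-ins for a PXML-like document; trailing text after an element is ignored.

```python
import xml.etree.ElementTree as ET


def dag_compress(root):
    """Share identical subtrees of an XML tree.

    Returns (root_id, table), where table[i] is the structural key
    (tag, text, attributes, child ids) of the i-th distinct subtree.
    """
    ids = {}    # structural key -> node id
    table = []  # node id -> structural key

    def visit(elem):
        key = (
            elem.tag,
            (elem.text or "").strip(),
            tuple(sorted(elem.attrib.items())),
            tuple(visit(child) for child in elem),
        )
        if key not in ids:  # first time this subtree shape is seen
            ids[key] = len(table)
            table.append(key)
        return ids[key]

    return visit(root), table


def count_nodes(elem):
    """Number of element nodes in the uncompressed tree."""
    return 1 + sum(count_nodes(child) for child in elem)


# Hypothetical PXML-like document: two possibilities for one person.
doc = ET.fromstring(
    "<persons><prob>"
    "<poss p='0.5'><person><name>John</name><tel>1111</tel></person></poss>"
    "<poss p='0.5'><person><name>John</name><tel>2222</tel></person></poss>"
    "</prob></persons>"
)

root_id, table = dag_compress(doc)
tree_size = count_nodes(doc)  # element nodes in the tree
dag_size = len(table)         # distinct subtrees in the DAG
```

In this small document only the repeated `name` element is shared, so the saving is modest; documents with many repeated subtrees across possibilities would share more.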
Original language: Undefined
Title of host publication: Proceedings of the 3rd International Conference on Scalable Uncertainty Management (SUM2009)
Place of Publication: Berlin
Publisher: Springer
Pages: 255-267
Number of pages: 14
ISBN (Print): 978-3-642-04387-1
Publication status: Published - Sept 2009
Event: 3rd International Conference on Scalable Uncertainty Management 2009 - Samuel Riggs Alumni Center, Orem Hall C, of the University of Maryland, College Park, United States
Duration: 28 Sept 2009 - 30 Sept 2009
Conference number: 3
http://wwwinfo.deis.unical.it/apugliese/SUM2009/

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer Verlag
Volume: 5785
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 3rd International Conference on Scalable Uncertainty Management 2009
Abbreviated title: SUM 2009
Country/Territory: United States
City: College Park
Period: 28/09/09 - 30/09/09

Keywords

  • IR-67795
  • DB-SDI: SCHEMA AND DATA INTEGRATION
  • EWI-15671
  • METIS-265216
