Adaptive Inference of Fine-grained Data Provenance to Achieve High Accuracy at Lower Storage Costs

Mohammad Rezwanul Huq, Andreas Wombacher, Peter M.G. Apers

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

8 Citations (Scopus)

Abstract

In stream data processing, data arrives continuously and is processed by decision-making, process control and e-science applications. To control and monitor these applications, reproducibility of results is a vital requirement. However, storing fine-grained provenance data requires a massive amount of storage space, especially for transformations with overlapping sliding windows. In this paper, we propose techniques that can significantly reduce storage costs while achieving high accuracy. Our evaluation shows that the adaptive inference technique can achieve almost 100% accurate provenance information for a given dataset at lower storage costs than the other techniques. Moreover, we present a guideline on the use of the different provenance collection techniques described in this paper, based on the transformation operation and stream characteristics.
Original language: English
Title of host publication: 7th IEEE International Conference on E-Science, e-Science 2011
Place of Publication: Piscataway, NJ
Publisher: IEEE Computer Society
Pages: 202-209
Number of pages: 8
ISBN (Electronic): 978-0-7695-4597-4
ISBN (Print): 978-1-4577-2163-2
DOIs: https://doi.org/10.1109/eScience.2011.36
Publication status: Published - Dec 2011
Event: 7th IEEE International Conference on e-Science 2011 - Stockholm, Sweden
Duration: 5 Dec 2011 → 8 Dec 2011
Conference number: 7

Conference

Conference: 7th IEEE International Conference on e-Science 2011
Abbreviated title: e-Science
Country: Sweden
City: Stockholm
Period: 5/12/11 → 8/12/11

Fingerprint

Process control
Costs
Decision making

Keywords

  • METIS-285072
  • IR-79577
  • Inference
  • EWI-21400
  • Stream Data
  • Storage
  • Fine grained data provenance

Cite this

Huq, M. R., Wombacher, A., & Apers, P. M. G. (2011). Adaptive Inference of Fine-grained Data Provenance to Achieve High Accuracy at Lower Storage Costs. In 7th IEEE International Conference on E-Science, e-Science 2011 (pp. 202-209). Piscataway, NJ: IEEE Computer Society. https://doi.org/10.1109/eScience.2011.36
@inproceedings{20d1a95f04a04e62b2df5425e5e457ee,
title = "Adaptive Inference of Fine-grained Data Provenance to Achieve High Accuracy at Lower Storage Costs",
abstract = "In stream data processing, data arrives continuously and is processed by decision-making, process control and e-science applications. To control and monitor these applications, reproducibility of results is a vital requirement. However, storing fine-grained provenance data requires a massive amount of storage space, especially for transformations with overlapping sliding windows. In this paper, we propose techniques that can significantly reduce storage costs while achieving high accuracy. Our evaluation shows that the adaptive inference technique can achieve almost 100{\%} accurate provenance information for a given dataset at lower storage costs than the other techniques. Moreover, we present a guideline on the use of the different provenance collection techniques described in this paper, based on the transformation operation and stream characteristics.",
keywords = "METIS-285072, IR-79577, Inference, EWI-21400, Stream Data, Storage, Fine grained data provenance",
author = "Huq, {Mohammad Rezwanul} and Andreas Wombacher and Apers, {Peter M.G.}",
note = "eemcs-eprint-21400",
year = "2011",
month = "12",
doi = "10.1109/eScience.2011.36",
language = "English",
isbn = "978-1-4577-2163-2",
pages = "202--209",
booktitle = "7th IEEE International Conference on E-Science, e-Science 2011",
publisher = "IEEE Computer Society",
address = "United States",

}
