On-the-fly data set combinations with RNTuple

  • Florine Willemijn de Geus*
  • Vincenzo Eduardo Padulano
  • Jakob Blomer
  • Philippe Canal
  • Ana-Lucia Varbanescu

*Corresponding author for this work

Research output: Contribution to journal; Conference article; Academic; peer-review

Abstract

With the expected increase in data volume at the HL-LHC and the even more complex computing challenges posed by future colliders, the need for efficient data storage and processing is becoming more pressing. ROOT’s next-generation data format and I/O subsystem, RNTuple, is designed to address these challenges. RNTuple already demonstrates a clear improvement in storage and I/O efficiency, as well as in overall stability and robustness, with respect to its predecessor, TTree. These improvements provide a solid baseline for introducing novel extensions to common high-energy and nuclear physics (HENP) workflows. Notably, many workflows could benefit from the ability to arbitrarily join and chain data set samples at runtime, which could reduce overall storage requirements and improve application runtime and ergonomics. In this paper, we present the RNTupleProcessor, which enables HENP data set combinations with RNTuple. We discuss the main design considerations, present the interfaces that support data set combinations, and show how they integrate into typical workflows.

Original language: English
Article number: 01013
Journal: EPJ Web of Conferences
Volume: 337
Publication status: Published - 7 Oct 2025
Event: 27th International Conference on Computing in High Energy and Nuclear Physics, CHEP 2024 - AGH University of Kraków, Kraków, Poland
Duration: 19 Oct 2024 to 25 Oct 2024
Conference number: 27
https://indico.cern.ch/event/1338689/
