Description
The software toolbox used for "big data" analysis has been changing rapidly in recent years. The adoption of software design approaches that exploit new hardware architectures and improve code expressiveness plays a pivotal role in boosting data-processing speed, resource optimisation, analysis portability and analysis preservation.
Scientific collaborations in the field of High Energy Physics (e.g. the LHC experiments, the next-generation neutrino experiments, and many more) are devoting increasing resources to the development and deployment of bleeding-edge software technologies in order to cope effectively with ever-growing data samples, extending the reach of individual experiments and of the HEP community as a whole.
The introduction of declarative paradigms in analysis description and implementation is gaining interest and support in the main collaborations. This approach can simplify and speed up the analysis description phase, support the portability of analyses across different datasets and experiments, and strengthen the preservation and reproducibility of results. Furthermore, by deeply decoupling the analysis algorithm from the back-end implementation, this approach is a key element for present and future processing speed, potentially even with back-ends that do not exist today.
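To make the paradigm concrete, the minimal sketch below uses ROOT's RDataFrame, an existing declarative interface in wide use in HEP (it is not the framework presented here); the file, tree and column names (events.root, Events, nMuon, Muon_pt) are assumptions made for the example.

import ROOT

# Each call below declares a transformation; nothing runs until a result
# (here the histogram) is requested, and then a single lazy event loop
# executes the whole chain.
df = ROOT.RDataFrame("Events", "events.root")
h = (df.Filter("nMuon >= 2", "at least two muons")
       .Define("leading_pt", "Muon_pt[0]")
       .Histo1D(("leading_pt", "Leading muon p_{T};p_{T} [GeV];Events",
                 50, 0.0, 200.0), "leading_pt"))
h.Draw()  # triggers the single lazy event loop over the dataset

Since the selection and the derived quantity are declared as expressions rather than coded as an explicit loop, the scheduling of the actual event loop (single-threaded, multi-threaded or distributed) is left entirely to the back-end.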
A framework characterised by a declarative paradigm for the analysis description, and able to operate on datasets from different experiments, is under development within the ICSC (Centro Nazionale di Ricerca in HPC, Big Data and Quantum Computing, Italy). The Python-based demonstrator provides a declarative interface for implementing an analysis of HEP event data, with support for different input data formats and for the extension to longer chains of analysis tasks.
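The demonstrator's actual interface is not specified in this abstract; the following is a purely hypothetical sketch, with all names (AnalysisChain, Step, run_local) invented for illustration, of how a declarative chain can record its steps lazily and hand them to an interchangeable back-end.

from dataclasses import dataclass, field
from typing import Callable

@dataclass(frozen=True)
class Step:
    kind: str                       # "filter" or "define"
    name: str
    func: Callable[[dict], object]

@dataclass
class AnalysisChain:
    steps: list = field(default_factory=list)

    def filter(self, name, func):
        # Record a selection step; nothing is executed yet (lazy).
        self.steps.append(Step("filter", name, func))
        return self

    def define(self, name, func):
        # Record a derived per-event quantity, again without executing it.
        self.steps.append(Step("define", name, func))
        return self

def run_local(chain, events):
    # One possible back-end: a plain Python loop over dict-like events.
    # A different back-end could compile the same chain for a cluster.
    selected = []
    for event in events:
        event = dict(event)
        keep = True
        for step in chain.steps:
            if step.kind == "filter" and not step.func(event):
                keep = False
                break
            if step.kind == "define":
                event[step.name] = step.func(event)
        if keep:
            selected.append(event)
    return selected

# Usage: select two-muon events and derive a toy observable.
chain = (AnalysisChain()
         .filter("two_muons", lambda ev: ev["n_muon"] >= 2)
         .define("pt_sum", lambda ev: sum(ev["muon_pt"])))
events = [{"n_muon": 2, "muon_pt": [30.0, 25.0]},
          {"n_muon": 1, "muon_pt": [40.0]}]
print(run_local(chain, events))  # one event survives, with pt_sum == 55.0

Because the chain is only a data structure until a back-end consumes it, the same analysis description could be executed by a local loop, a distributed engine, or a back-end that does not exist today, which is exactly the decoupling emphasised above.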
The status and development plan of the demonstrator will be discussed.