3–7 Jun 2019
Hotel Hermitage - Isola d'Elba
Europe/Rome timezone

Operational Intelligence

5 Jun 2019, 15:45
45m
Sala Maria Luisa (Hotel Hermitage - Isola d'Elba)

La Biodola 57037 Portoferraio (Li) Tel. +39.0565 9740 http://www.hotelhermitage.it/
Oral | Machine Learning | Software Technologies and ML

Speaker

Federica Legger (TO)

Description

Large international scientific collaborations will face unprecedented computing challenges in the near future. Processing and storing multi-petabyte datasets requires a global federated infrastructure of distributed computing resources. The current systems have proven to be mature and capable of meeting the experiment goals by allowing timely delivery of physics results. However, substantial manual intervention from experts and shifters is needed to operate such heterogeneous infrastructures efficiently. At the same time, logging information from computing services and systems is being archived and is becoming available on big data platforms such as Elasticsearch, Hadoop, and NoSQL databases. We plan to exploit this wealth of information to increase the level of automation in computing operations by using suitable techniques, such as machine learning (ML), tailored to specific problems. ML models that predict data placement and access patterns can be used to increase the efficiency of resource exploitation and the overall throughput of the experiments' distributed computing infrastructures. Time-series techniques can be used to estimate the time needed to complete certain tasks, such as processing a given number of events or transferring a given amount of data. Anomaly detection techniques can be employed to predict system failures, for example those that lead to network congestion. Recording and analysing shifter actions can be used to automate tasks such as creating tickets for computing centres or suggesting possible solutions to typical issues. We discuss how the use of common state-of-the-art technologies across experiments can be the key to building general solutions that can be adopted in the various scientific domains and reused across different experiments.
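
As an illustration of the anomaly detection idea mentioned above, the following is a minimal sketch, not the method used by the authors. It flags samples in a monitoring time series that deviate strongly from a rolling window of preceding values. The function name, the window size, the threshold, and the throughput numbers are all hypothetical and chosen only for the example.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices of samples that lie more than `threshold` standard
    deviations away from the mean of the `window` preceding samples."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)  # sample standard deviation of the window
        if sigma == 0:
            # Perfectly flat history: a z-score is undefined, so skip.
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical transfer throughput samples (MB/s); the drop to 20 is
# an invented stand-in for a congestion event.
throughput = [100, 102, 98, 101, 99, 100, 103, 20, 101, 99]
print(rolling_zscore_anomalies(throughput))  # prints [7]
```

A rolling z-score is only one of the simplest possible detectors. It is shown here because it needs no training data, whereas an ML-based approach of the kind discussed in the talk would typically learn from archived logs.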

Primary authors

Daniele Bonacorsi (BO), Alessandro Di Girolamo (INFN), Federica Legger (TO)
