8–10 Apr 2026
DAMA Tecnopolo - Bologna
Europe/Rome timezone

Foundation Models for Robust and Sample-Independent Jet Flavour Tagging

9 Apr 2026, 10:00
15m
Oral presentation, Energy Frontier

Speaker

Mr Riccardo Riva (ATLAS-CERN)

Description

The precise identification of jets initiated by heavy-flavour quarks is a central challenge in the physics programme of the Large Hadron Collider (LHC). Modern flavour-tagging algorithms rely on deep learning models trained on labelled Monte Carlo (MC) simulated data. A key limitation of these approaches is that the embedding learned from the input variables is typically tightly coupled to the specific MC generator used during training. This coupling can lead to significant discrepancies when the taggers are applied to MC simulations produced with other generators or to real data, which inflates systematic uncertainties and reduces analysis sensitivity. As a consequence, changes in the simulation setup often require a dedicated recalibration of the model output, and there is a risk that the network captures generator-specific artefacts rather than genuine physical features.
In this work, we propose a novel strategy to address this issue by constructing embeddings that are robust to variations in the underlying MC simulation. Our approach uses foundation models for tabular data to generate explicit embeddings. We introduce a pre-training phase to learn a jet-level representation that is invariant across different MC generators, such that jets with identical physical properties are mapped to the same embedding regardless of the simulation used to produce them.
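
The invariance requirement can be illustrated with a toy example. The sketch below is not the method presented in this contribution: the linear-plus-tanh encoder, the paired-jet setup, the mean-squared-distance loss and all names (`embed`, `invariance_loss`, `x_a`, `x_b`) are assumptions chosen for illustration only. It shows one possible way to quantify how far the embeddings of the same jets, as described by two generators, lie from each other.

```python
import numpy as np

def embed(x, W):
    """Hypothetical encoder: linear map followed by tanh (illustrative only)."""
    return np.tanh(x @ W)

def invariance_loss(x_gen_a, x_gen_b, W):
    """Mean squared distance between embeddings of paired jets
    simulated with two different (toy) generators."""
    z_a = embed(x_gen_a, W)
    z_b = embed(x_gen_b, W)
    return float(np.mean(np.sum((z_a - z_b) ** 2, axis=1)))

rng = np.random.default_rng(0)
n_jets, n_features, n_embed = 256, 8, 4

# Toy "generator A" jets: random tabular features (assumption, not real MC).
x_a = rng.normal(size=(n_jets, n_features))

# Toy "generator B": the same jets, with a generator-dependent shift
# applied to the last feature only.
x_b = x_a.copy()
x_b[:, -1] += 0.5

# A random encoder is sensitive to the generator-dependent feature.
W = rng.normal(scale=0.1, size=(n_features, n_embed))
loss_before = invariance_loss(x_a, x_b, W)

# An encoder that ignores that feature maps paired jets to the same embedding.
W_inv = W.copy()
W_inv[-1, :] = 0.0
loss_after = invariance_loss(x_a, x_b, W_inv)

print(f"loss with generic encoder:   {loss_before:.4f}")
print(f"loss with invariant encoder: {loss_after:.4f}")
```

A loss of this kind on its own is also minimised by an encoder that maps every jet to the same point, so a realistic pre-training objective would need an additional term that preserves the physical information in the embedding, for example a flavour-classification or contrastive term.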

Author

Mr Riccardo Riva (ATLAS-CERN)
