EuCAIFCon 2025

Name: EuCAIFCon 2025
Start: 2025-06-16T00:02:00+02:00
End: 2025-06-20T14:00:00+02:00
Location: THotel, Cagliari, Sardinia, Italy

Europe/Rome timezone

✨ Event Tokenization and Next-Token Prediction for Anomaly Detection at the LHC

18 Jun 2025, 17:22

T1a+T1b

Poster Session B Patterns & Anomalies 🔀 Simulations & Generative Models

Speaker: Ambre Visive (Nikhef - University of Amsterdam)

Advances in machine learning, particularly Large Language Models (LLMs), enable more efficient interaction with complex datasets through tokenization and next-token prediction strategies. This talk presents and compares several approaches to structuring particle physics data as token sequences, allowing LLM-inspired models to learn event distributions and detect anomalies via next-token (or masked-token) prediction. Trained only on background events, the model learns to reconstruct expected physics processes. At inference, both background and signal events are processed, and reconstruction scores identify deviations from the learned patterns, flagging potential anomalies. This event tokenization strategy not only enables anomaly detection but also represents a potential new approach to training a foundation model at the LHC. The method is tested on simulated proton-proton collision data from the Dark Machines Collaboration and applied to a four-top-quark search, replicating ATLAS conditions during LHC Run 2 ($\sqrt{s} = 13 \text{ TeV}$). Results are compared with other anomaly detection strategies.
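The general idea can be illustrated with a toy sketch. The abstract does not specify the tokenization scheme or the model architecture, so everything below is an assumption made for illustration: the bin edges, the composite `type|pt|eta` token format, and the helper names (`bin_index`, `tokenize_event`, `BigramModel`) are hypothetical, and a smoothed bigram model stands in for the LLM-inspired transformer. The anomaly score is taken to be the mean negative log-likelihood of the next token, with the model fitted on background-like events only.

```python
import math
from collections import Counter, defaultdict

# Hypothetical bin edges chosen for illustration only.
PT_EDGES = [0.0, 25.0, 50.0, 100.0, 200.0, 400.0]  # GeV
ETA_EDGES = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]


def bin_index(value, edges):
    """Return the bin index of value; under/overflow is clamped to the end bins."""
    for i in range(1, len(edges)):
        if value < edges[i]:
            return i - 1
    return len(edges) - 2


def tokenize_event(objects):
    """Map an event, given as (type, pt, eta) tuples, to a token sequence.

    Objects are ordered by descending pT and each becomes one composite token.
    """
    ordered = sorted(objects, key=lambda o: o[1], reverse=True)
    tokens = ["<bos>"]
    for obj_type, pt, eta in ordered:
        pt_bin = bin_index(pt, PT_EDGES)
        eta_bin = bin_index(eta, ETA_EDGES)
        tokens.append(f"{obj_type}|pt{pt_bin}|eta{eta_bin}")
    tokens.append("<eos>")
    return tokens


class BigramModel:
    """Add-one-smoothed bigram next-token model (a stand-in for a transformer)."""

    def __init__(self):
        self.counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, sequences):
        for seq in sequences:
            self.vocab.update(seq)
            for prev, nxt in zip(seq, seq[1:]):
                self.counts[prev][nxt] += 1
        return self

    def log_prob(self, prev, nxt):
        context = self.counts.get(prev, Counter())
        total = sum(context.values())
        size = len(self.vocab) + 1  # one extra slot reserves mass for unseen tokens
        return math.log((context[nxt] + 1) / (total + size))

    def anomaly_score(self, seq):
        """Mean negative log-likelihood per predicted token; higher is more anomalous."""
        nll = [-self.log_prob(p, n) for p, n in zip(seq, seq[1:])]
        return sum(nll) / len(nll)


# Toy "background" sample: dijet-like events (invented numbers, not real data).
background = [
    [("jet", 120.0, 0.3), ("jet", 60.0, -1.0)],
    [("jet", 110.0, 0.1), ("jet", 70.0, -0.9)],
    [("jet", 150.0, 0.4), ("jet", 55.0, -1.2)],
]
model = BigramModel().fit(tokenize_event(e) for e in background)

typical = tokenize_event([("jet", 130.0, 0.2), ("jet", 65.0, -0.8)])
unusual = tokenize_event(
    [("muon", 300.0, 2.0), ("electron", 250.0, -2.2), ("jet", 30.0, 0.0)]
)

# The event unlike the training sample receives the higher score.
print(model.anomaly_score(typical) < model.anomaly_score(unusual))
```

In this sketch, coarser bins shrink the vocabulary but discard kinematic detail, while finer bins do the opposite; the same scoring rule would apply unchanged if the bigram model were replaced by any model that returns next-token probabilities.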

AI keywords: anomaly detection; tokenization; large language model; transformers; next-token prediction

Primary author: Ambre Visive (Nikhef - University of Amsterdam)

Co-authors: Dr Clara Nellist (Nikhef - University of Amsterdam); Mrs Polina Moskvitina (Nikhef - Radboud University); Dr Roberto Ruiz de Austri (Valencia University, IFIC); Dr Sascha Caron (Nikhef - Radboud University)

Presentation materials: Flashtalk.pdf; Poster.pdf
