A multi-task Large Language Model for jets

29 Jul 2024, 17:20
20m
Palazzo Ducale (Genova, Italy)

Talk: Novel Techniques

Speaker

Humberto Reyes-Gonzalez (RWTH Aachen)

Description

Multi-head attention based Transformers have taken the world by storm, thanks to their outstanding capacity to learn accurate representations of diverse types of data. Famous examples include Large Language Models, such as ChatGPT, and Vision Transformers, such as BEiT, for image representation learning. In this talk, we bring these major technological advances to the realm of jet physics. By discretizing the jet constituents, we let an autoregressive Transformer network learn the ‘language’ of jet substructure. We demonstrate that our Transformer model learns highly accurate representations of different types of jets, including precise predictions of their constituent multiplicity, while providing explicit density estimation. Moreover, we show that the Transformer model can be used for a variety of tasks involving both jet tagging and generation. Finally, we discuss how a pre-trained Transformer can serve as a baseline for fine-tuned models built for specific tasks where data may be scarce.
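To make the approach concrete, the following is a minimal sketch, not the speaker's implementation, of the two ingredients the abstract describes: binning continuous constituent features (pT, eta, phi) into discrete tokens, and training an autoregressive Transformer whose per-token probabilities multiply, via the chain rule, into an explicit jet density. All bin ranges, vocabulary layout, model sizes, and names (tokenize, JetTransformer) are illustrative assumptions.

```python
# Illustrative sketch only: discretize jet constituents into tokens and model
# them autoregressively with a Transformer. Bin ranges, vocabulary layout, and
# hyperparameters are assumptions, not the actual setup from the talk.
import torch
import torch.nn as nn

N_BINS = 16  # bins per feature (pt, eta, phi); assumed value

def tokenize(constituents):
    """Map continuous (pt, eta, phi) triplets to one token id per constituent."""
    lo = torch.tensor([0.0, -2.5, -torch.pi])   # assumed feature ranges
    hi = torch.tensor([500.0, 2.5, torch.pi])
    bins = ((constituents - lo) / (hi - lo) * N_BINS).long().clamp(0, N_BINS - 1)
    # flatten the three per-feature bin indices into a single vocabulary index
    return bins[..., 0] * N_BINS**2 + bins[..., 1] * N_BINS + bins[..., 2]

class JetTransformer(nn.Module):
    """Autoregressive (causally masked) Transformer over constituent tokens."""
    def __init__(self, vocab_size, d_model=64, n_heads=4, n_layers=2, max_len=64):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        T = tokens.size(1)
        # causal mask so each position only attends to earlier constituents
        causal = torch.triu(torch.full((T, T), float("-inf"),
                                       device=tokens.device), diagonal=1)
        h = self.tok(tokens) + self.pos(torch.arange(T, device=tokens.device))
        return self.head(self.encoder(h, mask=causal))

# Toy usage: the chain rule gives an explicit density,
# log p(jet) = sum_i log p(token_i | tokens_<i).
model = JetTransformer(vocab_size=N_BINS**3)
jets = torch.randint(0, N_BINS**3, (8, 20))      # stand-in for tokenized jets
logp = torch.log_softmax(model(jets[:, :-1]), dim=-1)
nll = -logp.gather(-1, jets[:, 1:].unsqueeze(-1)).squeeze(-1).sum(-1)
print(nll.shape)  # per-jet negative log-likelihood, shape (8,)
```

In the fine-tuning setting the abstract mentions, one would typically reuse the pre-trained encoder weights and retrain only the output head (or a small task-specific head) on the scarce dataset.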

Primary authors

Alexander Mück (RWTH Aachen), Humberto Reyes-Gonzalez (RWTH Aachen), Michael Krämer (RWTH Aachen)

Presentation materials