6–13 Jul 2022
Bologna, Italy
Europe/Rome timezone

Non-Parametric Data-Driven Background Modelling using Conditional Probabilities

9 Jul 2022, 18:00
15m
Room 12 (Celeste)

Parallel Talk Computing and Data handling

Speaker

Julia Manuela Silva (University of Birmingham)

Description

Background modelling is one of the main challenges of particle physics analyses at hadron colliders. Two strategies are commonly employed: simulations based on Monte Carlo event generators, and parametric methods. However, sufficiently accurate simulations are not always available, or may be computationally costly to produce with high statistics, leading to uncertainties that can limit the sensitivity of searches. Parametric methods, on the other hand, rely on a functional form with free parameters to fit the observed data, which may bias the extraction of a potential signal.

A novel approach for non-parametric data-driven background modelling is presented, which addresses these issues for a broad class of searches and measurements [1]. This approach relies on a relaxed version of the event selection to estimate conditional probability density functions. Two different methods are provided for its implementation. The first is based on ancestral sampling and uses the data from the relaxed selection to obtain a graph of probability density functions of the relevant variables, accounting for the most significant correlations. A background model is generated by sampling events from this graph and then applying the full event selection. This provides a robust implementation for cut-and-count analyses. The strategy is further expanded in the second implementation, in which a generative adversarial network is trained to estimate the joint probability density function of the variables used in the analysis, conditioned on the variable used to blind the signal region. This training proceeds in the sidebands, and the conditional probability density function is interpolated into the signal region to estimate the background. The application of each implementation is presented and their performance is discussed.
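
The ancestral-sampling idea can be illustrated with a minimal Python sketch. It is not the implementation of [1]: it assumes a toy graph with only two variables (a parent x0 and a child x1), uses plain histograms as the density estimates, and all names (`fit_chain`, `sample_chain`, the toy data and the cut value) are hypothetical and chosen for illustration only.

```python
import numpy as np

def fit_chain(data, bins=20):
    """Estimate p(x0) and p(x1 | x0) as histograms (assumed density model)."""
    joint, e0, e1 = np.histogram2d(data[:, 0], data[:, 1], bins=bins)
    row = joint.sum(axis=1, keepdims=True)       # counts per x0 bin
    p0 = row[:, 0] / row.sum()                   # marginal p(x0 bin)
    # Row-normalise the joint counts to get p(x1 bin | x0 bin).
    # Empty x0 bins stay at zero; they have p0 = 0 and are never drawn.
    cond = np.divide(joint, row, out=np.zeros_like(joint), where=row > 0)
    return p0, cond, e0, e1

def sample_chain(model, n, rng):
    """Ancestral sampling: draw the parent x0 first, then x1 given x0."""
    p0, cond, e0, e1 = model
    i0 = rng.choice(len(p0), size=n, p=p0)       # parent bin
    cdf = np.cumsum(cond[i0], axis=1)            # conditional CDF per event
    u = rng.random(n)
    i1 = (u[:, None] > cdf).sum(axis=1)          # inverse-CDF bin lookup
    i1 = np.minimum(i1, cond.shape[1] - 1)       # guard against rounding
    # Place each event uniformly inside its bin.
    x0 = rng.uniform(e0[i0], e0[i0 + 1])
    x1 = rng.uniform(e1[i1], e1[i1 + 1])
    return np.column_stack([x0, x1])

# Toy stand-in for data passing the relaxed selection (correlated pair).
rng = np.random.default_rng(0)
a = rng.normal(size=20000)
b = 0.8 * a + 0.6 * rng.normal(size=20000)
relaxed = np.column_stack([a, b])

model = fit_chain(relaxed)
generated = sample_chain(model, 50000, rng)

# Hypothetical "full event selection", applied after sampling.
selected = generated[generated[:, 1] > 1.0]
```

Because the child variable is drawn from a density conditioned on the parent's bin, the generated sample keeps the correlation between the two variables, and it can be made larger than the relaxed-selection sample before the full selection is applied. A realistic graph would contain more variables and would only link those with significant correlations.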

[1] https://arxiv.org/abs/2112.00650

In-person participation: Yes

Primary authors

Andrew Stephen Chisholm (University of Birmingham)
Thomas Neep (University of Birmingham)
Konstantinos Nikolopoulos (University of Birmingham)
Rhys Owen (UKRI-STFC Rutherford Appleton Laboratory)
Elliot Reynolds (Berkeley National Laboratory)
Julia Manuela Silva (University of Birmingham)
