An important challenge in machine learning is to predict the initial conditions under which a given neural network will be trainable. We present a method for predicting the trainable regime in parameter space for deep feedforward neural networks (DNNs) based on reconstructing the input from subsequent activation layers via a cascade of single-layer auxiliary networks. We show that a single...
Astrophysical sources vary across vast timescales, providing insight into extreme dynamical phenomena, from solar outbursts to distant AGNs and GRBs. These time-varying processes are often complex, nonlinear, and non-Gaussian, making it difficult to disentangle underlying causal mechanisms, which may act simultaneously or sequentially. Using solar variability and AGNs as examples, we...
Our work identifies the sources of 11 interconnected machine learning (ML) biases that hinder the generalisation of supervised learning models in the context of gravitational wave (GW) detection. We use GW domain knowledge to propose a set of mitigation tactics and training strategies for ML algorithms that aim to address these biases concurrently and improve detection sensitivity. We...
Neural network emulators or surrogates are widely used in astrophysics and cosmology to approximate expensive simulations, accelerating both likelihood-based inference and training for simulation-based inference. However, emulator accuracy requirements are often justified heuristically rather than with rigorous theoretical bounds. We derive a principled upper limit on the information loss...
The graph coloring problem is an optimization problem involving the assignment of one of q colors to each vertex of a graph such that no two adjacent vertices share the same color. This problem is computationally challenging and arises in several practical applications. We present a novel algorithm that leverages graph neural networks to tackle the problem efficiently, particularly for large...
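As a concrete reading of the problem statement above, the following minimal sketch checks whether a given assignment is a proper q-coloring and counts the violated edges. It is not the authors' graph-neural-network algorithm; the helper names `count_conflicts` and `is_proper_coloring` and the 4-cycle example are illustrative assumptions.

```python
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def count_conflicts(edges: List[Edge], coloring: Dict[Hashable, int]) -> int:
    """Return the number of edges whose two endpoints share a color."""
    return sum(1 for u, v in edges if coloring[u] == coloring[v])


def is_proper_coloring(edges: List[Edge], coloring: Dict[Hashable, int], q: int) -> bool:
    """True if every color lies in range(q) and no edge is monochromatic."""
    colors_ok = all(0 <= c < q for c in coloring.values())
    return colors_ok and count_conflicts(edges, coloring) == 0


# Hypothetical example: a 4-cycle, which is 2-colorable by alternating colors.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
good = {0: 0, 1: 1, 2: 0, 3: 1}  # alternating colors, no conflicts
bad = {0: 0, 1: 0, 2: 1, 3: 1}   # edges (0, 1) and (2, 3) are monochromatic
```

The conflict count is a natural quantity to minimize when the problem is treated as an optimization: a coloring is proper exactly when the count is zero.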
Experimental studies of 𝑏-hadron decays face significant challenges due to a wide range of backgrounds arising from the numerous possible decay channels with similar final states. For a particular signal decay, the process for ascertaining the most relevant background processes necessitates a detailed analysis of final state particles, potential misidentifications, and kinematic overlaps...
The Large Hadron Collider (LHC) at CERN generates vast amounts of data from high-energy particle collisions, requiring advanced machine learning techniques for effective analysis. While Graph Neural Networks (GNNs) have demonstrated strong predictive capabilities in high-energy physics (HEP) applications, their "black box" nature often limits interpretability. To address this challenge, we...
In this conference contribution, we present our findings on applying Artificial Neural Networks (ANNs) to enhance off-vertex topology recognition using data from the HADES experiment at GSI, Darmstadt. Our focus is on decays of $\Lambda$ and K$^0_{\text{S}}$ particles produced in heavy-ion as well as elementary reactions. We demonstrate how ANNs can enhance the separation of weak decays from...
The resolution of any detector is finite, leading to distortions in the measured distributions. Within physics research, the indispensable correction of these distortions is known as Unfolding. Machine learning research uses a different term for this very task: Quantification Learning. For the past two decades, this difference in terminology (and some differences in notation) has prevented...
Machine learning techniques are used to predict theoretical constraints—such as unitarity, boundedness from below, and the potential minimum—in multi-scalar models. This approach has been demonstrated to be effective when applied to various extensions of the Standard Model that incorporate additional scalar multiplets. A high level of predictivity is achieved through appropriate neural network...
Background: In High Energy Physics (HEP), jet tagging is a fundamental classification task that has been extensively studied using deep learning techniques. Among these, transformer networks have gained significant popularity due to their strong performance and intrinsic attention mechanisms. Furthermore, pre-trained transformer models are available for a wide range of classification...
Extracting continuum properties from discretized quantum field theories is significantly hindered by lattice artifacts. Fixed-point (FP) actions, defined via renormalization group transformations, offer an elegant solution by suppressing these artifacts even on coarse lattices. In this work, we employ gauge-covariant convolutional neural networks to parameterize an FP action for...
While cross sections are the fundamental experimental observables in scattering processes, the full quantum dynamics of the interactions are encoded in the complex-valued scattering amplitude. Since cross sections depend only on the squared modulus of the amplitude, reconstructing the complete information from nuclear and particle physics experiments becomes a challenging inverse problem. In...
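To make the inverse problem described above concrete, consider the illustrative special case of elastic scattering of spinless particles. The symbols $f(\theta)$ for the amplitude and $\varphi(\theta)$ for its phase are assumptions introduced here, not notation taken from the abstract.

```latex
\[
  \frac{d\sigma}{d\Omega} = \left| f(\theta) \right|^{2},
  \qquad
  f(\theta) = \left| f(\theta) \right| e^{i\varphi(\theta)} .
\]
```

The measured cross section fixes only the modulus $|f(\theta)|$. The phase $\varphi(\theta)$ drops out of the squared modulus and has to be recovered from additional constraints.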
Understanding hadron structure requires the extraction of Quantum Correlation Functions (QCFs), such as parton distribution functions and fragmentation functions, from experimental data. The extraction of QCFs involves solving an inversion problem, which is ill-posed due to errors and limitations in the experimental data.
To address this challenge, we propose a novel method for extracting...
Today, many physics experiments rely on Machine Learning (ML) methods to support their data analysis pipelines. Although ML has revolutionized science, most models remain difficult to interpret: it is unclear how they compute their results and how they use the information in the datasets they are given. In this work, we introduce physics-guided ML methods that keep the...
The adoption of AI-based techniques in theoretical research is often slower than in other fields due to the perception that AI-based methods lack rigorous validation against theoretical counterparts. In this talk, we introduce COEmuNet, a surrogate model designed to emulate carbon monoxide (CO) line radiation transport in stellar atmospheres.
COEmuNet is based on a three-dimensional...
The Matrix-Element Method (MEM) has long been a cornerstone of data analysis in high-energy physics. It leverages theoretical knowledge of parton-level processes and symmetries to evaluate the likelihood of observed events. We combine MEM-inspired symmetry considerations with equivariant neural network design for particle physics analysis. Even though Lorentz invariance and permutation...
https://arxiv.org/abs/2501.03921
Simulation-based inference is undergoing a renaissance in statistics and machine learning. With several packages implementing the state-of-the-art in expressive AI [mackelab/sbi] [undark-lab/swyft], it is now being effectively applied to a wide range of problems in the physical sciences, biology, and beyond.
Given the rapid pace of AI/ML, there is little...