3–6 Feb 2026
Europe/Rome timezone

Ultra-Low-Latency Tree Tensor Network Inference on FPGAs

4 Feb 2026, 12:15
20m
Auditorium U12 - Guido Martinotti

Università degli Studi di Milano-Bicocca, Edificio U12, Via Vizzola, 5, 20126 Milano (MI)

Speaker

Lorenzo Borella (Istituto Nazionale di Fisica Nucleare)

Description

Tensor Networks (TNs) are a powerful computational framework originally developed for the efficient representation and simulation of quantum many-body systems. In recent years, they have gained increasing attention in machine learning (ML), where they have shown performance competitive with conventional models on supervised learning tasks.
In this work, we investigate the suitability of Tree Tensor Networks (TTNs) for high-frequency, real-time inference by exploiting the low-latency and high-throughput capabilities of Field-Programmable Gate Arrays (FPGAs). We present and evaluate multiple hardware implementations of TTN-based classifiers, targeting both standard ML benchmarks and complex datasets arising from physics applications.
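To make the structure of a TTN classifier concrete, the following is a minimal sketch of a forward pass through a binary tree of three-index tensors. It is an illustration under stated assumptions, not the implementation evaluated in this work: the cosine/sine feature map, the binary tree layout, and the names `feature_map`, `contract_node`, and `ttn_forward` are all hypothetical choices made for the example.

```python
import math

def feature_map(x):
    """Embed a scalar in [0, 1] as a 2-component local vector (assumed encoding)."""
    return [math.cos(math.pi * x / 2), math.sin(math.pi * x / 2)]

def contract_node(tensor, left, right):
    """Contract a node tensor[l][r][o] with the vectors coming from its two children."""
    out_dim = len(tensor[0][0])
    out = [0.0] * out_dim
    for l, lv in enumerate(left):
        for r, rv in enumerate(right):
            for o in range(out_dim):
                out[o] += tensor[l][r][o] * lv * rv
    return out

def ttn_forward(layers, features):
    """Evaluate a binary TTN; layers[0] acts on the embedded inputs, layers[-1] is the root."""
    vectors = [feature_map(x) for x in features]
    for layer in layers:
        # Each node merges two neighbouring vectors, halving the layer width.
        vectors = [contract_node(node, vectors[2 * i], vectors[2 * i + 1])
                   for i, node in enumerate(layer)]
    return vectors[0]  # root output: one score per class

def ones(a, b, c):
    """Placeholder a x b x c tensor filled with ones (stands in for trained weights)."""
    return [[[1.0] * c for _ in range(b)] for _ in range(a)]

# Hypothetical sizes: 4 input features, bond dimension 2, 2 output classes.
layers = [[ones(2, 2, 2), ones(2, 2, 2)], [ones(2, 2, 2)]]
scores = ttn_forward(layers, [0.0, 0.0, 1.0, 1.0])
```

Every node in a layer depends only on its own two children, so the nodes of one layer can be evaluated in parallel and successive layers can be pipelined, which is the property that makes the tree layout attractive for hardware.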
During training, a systematic analysis is performed to determine optimal bond dimensions and weight quantization schemes. This analysis is informed by entanglement entropy and correlation function measurements, which provide insight into the representational capacity required by the model and guide the selection of the TTN architecture.
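As an illustration of how an entanglement measurement can inform the bond dimension, the sketch below computes the von Neumann entropy across a single bond from its singular values and picks the smallest bond dimension that keeps the discarded weight under a tolerance. The abstract does not specify the selection rule used, so the truncation criterion, the tolerance, and the function names here are assumptions. The singular values are assumed to come from a singular value decomposition across the bond of interest.

```python
import math

def schmidt_probabilities(singular_values):
    """Normalise squared singular values so they sum to one."""
    total = sum(s * s for s in singular_values)
    return [s * s / total for s in singular_values]

def entanglement_entropy(singular_values):
    """Von Neumann entropy S = -sum_i p_i ln p_i across one bond."""
    probs = schmidt_probabilities(singular_values)
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def min_bond_dimension(singular_values, max_discarded_weight):
    """Smallest chi whose discarded weight is at or below the tolerance (assumed rule)."""
    probs = sorted(schmidt_probabilities(singular_values), reverse=True)
    kept = 0.0
    for chi, p in enumerate(probs, start=1):
        kept += p
        if 1.0 - kept <= max_discarded_weight:
            return chi
    return len(probs)
```

A bond with entropy S needs a bond dimension of at least exp(S) to be represented without loss, so bonds with low measured entropy are natural candidates for a smaller dimension and therefore fewer multiplications in hardware.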
Following training, the TTN models are mapped onto a dedicated FPGA accelerator integrated within a server environment, with inference fully offloaded to hardware. This enables highly efficient, fully pipelined execution, achieving substantial reductions in inference latency. As a demonstrative application, we deploy a TTN-based classifier for a High Energy Physics (HEP) use case, achieving sub-microsecond inference latency while maintaining competitive classification performance.
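The effect of a weight quantization scheme can be emulated in software before a model is mapped to hardware. The abstract does not state the number format used on the FPGA; the sketch below assumes signed fixed-point weights with saturation, and the bit widths and function names are hypothetical.

```python
def quantize_fixed_point(value, total_bits, frac_bits):
    """Round a value to a signed fixed-point grid, saturating at the representable range."""
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    # Python's round() resolves exact ties to the nearest even integer.
    code = max(lowest, min(highest, round(value * scale)))
    return code / scale

def max_quantization_error(weights, total_bits, frac_bits):
    """Largest absolute deviation introduced by quantizing a list of weights."""
    return max(abs(w - quantize_fixed_point(w, total_bits, frac_bits)) for w in weights)
```

For weights inside the representable range the error is bounded by half of one quantization step, 2^-(frac_bits + 1), so comparing classification performance at several bit widths gives a direct view of the trade-off between precision and hardware resources.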
These results demonstrate the feasibility of deploying quantum-inspired TN models within Level-1 trigger systems of HEP experiments, satisfying the stringent latency and throughput requirements while preserving robust classification performance. This work establishes TNs as a promising paradigm for real-time decision-making on specialized low-latency hardware platforms.

Session: Quantum Machine Learning
Invited: No

Authors

Alberto Coppi (Istituto Nazionale di Fisica Nucleare)
Dr Andrea Stanco (Università di Padova)
Andrea Triossi (Università degli Studi di Padova)
Jacopo Pazzini (Istituto Nazionale di Fisica Nucleare)
Lorenzo Borella (Istituto Nazionale di Fisica Nucleare)
Dr Marco Trenti
Marco Zanetti (Istituto Nazionale di Fisica Nucleare)
