Speaker
Description
The increasing convergence of High‑Performance Computing (HPC) and Artificial Intelligence (AI) is transforming scientific computing infrastructures used by large research organizations such as INFN. Driven by shared reliance on accelerators, high‑speed interconnects, advanced storage, and open software ecosystems, HPC platforms are increasingly required to support data‑intensive AI workloads alongside traditional simulation‑driven science.
This contribution explores where this convergence is occurring across scientific applications, enabling technologies, and deployment models, and examines the resulting challenges for large‑scale research infrastructures. Rapid growth in processor and accelerator power densities imposes significant constraints on power delivery, thermal management, and facility design, pushing conventional air‑cooled approaches beyond their practical limits.
Using real‑world examples of large AI training systems and AI‑factory‑style reference architectures, the presentation highlights the role of direct liquid cooling in improving energy efficiency, reducing operational overhead, and enabling sustainable scaling. It concludes with key considerations for designing future‑proof HPC–AI infrastructures that support next‑generation scientific research.