Description
The LHCb HLT2 Farm comprises ~4,500 nodes and ~260k CPU cores, primarily dedicated to real-time data processing. Extended idle periods during technical stops leave a significant computing resource underexploited.
We present an HTCondor-based opportunistic computing model that integrates the HLT2 Farm into the LHCb distributed computing infrastructure while preserving strict priority for online activities. A custom HTCondor ClassAd evaluates node-level transitions between the Idle and Not Idle HLT2 states, so that Grid jobs run only when both the STARTD policies and the state evaluation return TRUE. When HLT2 reclaims a node, running jobs receive a SIGUSR1 signal and terminate gracefully at the next event boundary, which keeps the scheme compatible with Gaudi workflows.
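The graceful-termination step can be illustrated with a minimal Python sketch. It is not the Gaudi or LHCbDirac implementation: the names `GracefulStop`, `process_events`, and `run` are hypothetical, and the only assumption taken from the abstract is that a job receiving SIGUSR1 should finish the event in progress and stop before starting the next one. The handler only sets a flag, and the event loop checks that flag between events.

```python
import os
import signal


class GracefulStop:
    """Hypothetical SIGUSR1 handler: records the request, does nothing else."""

    def __init__(self):
        self.requested = False

    def __call__(self, signum, frame):
        # Only set a flag; the event loop decides when it is safe to stop.
        self.requested = True


def process_events(events, process_one, stop):
    """Process events in order, honouring a stop request only between events."""
    done = 0
    for event in events:
        if stop.requested:
            break  # event boundary: no event is left half-processed
        process_one(event)
        done += 1
    return done


def run(n_events, signal_at=None):
    """Toy job: optionally sends SIGUSR1 to itself while handling one event."""
    stop = GracefulStop()
    previous = signal.signal(signal.SIGUSR1, stop)
    processed = []

    def process_one(event):
        processed.append(event)
        if event == signal_at:
            # Stand-in for the signal the batch system would send on reclamation.
            os.kill(os.getpid(), signal.SIGUSR1)

    try:
        done = process_events(range(n_events), process_one, stop)
    finally:
        # Restore whatever handler was installed before.
        signal.signal(signal.SIGUSR1, previous)
    return done, processed
```

In this sketch the event during which the signal arrives is still completed, and the loop exits before the following event begins. A real job would additionally flush its output files within whatever grace period the batch system allows before a hard kill.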
The system integrates with LHCbDirac for automated pilot submission.
Since commissioning, the infrastructure has processed a considerable fraction of Monte Carlo (MC) simulation jobs.
This work demonstrates a viable strategy for bridging online and offline computing that maximizes resource efficiency while maintaining operational safety, and it is directly relevant to future hybrid resource models.