Description
INFN Cloud provides a federated portfolio of cloud services for scientific communities, using an Infrastructure as Code (IaC) approach for PaaS deployments. As the infrastructure scales, the legacy INDIGO PaaS Orchestration system, a monolithic, request-driven federation middleware, shows structural limitations in maintainability, extensibility, and cross-site portability.
This work presents an architectural shift that moves the orchestrator to a decoupled, event-driven, geo-distributed pipeline. The legacy core is being replaced by modular, stateless Python microservices that communicate through Apache Kafka, which acts as the central event bus. This design decouples data ingestion, monitoring, event processing, and scheduling, and supports high availability and disaster recovery across geographically distributed clusters. The asynchronous, event-driven model also streamlines distributed operations: independent development teams can each own an isolated domain, such as monitoring, data-driven scheduling, or external communications.
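The decoupling described above can be illustrated with a minimal sketch. It is not the INFN Cloud implementation and does not use a Kafka client: `InMemoryBus` is a toy in-process stand-in for the event bus, and the topic names, payload fields, and handler functions are all hypothetical. It only shows how stateless handlers that consume one topic and emit events on another can be developed independently of each other.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List


@dataclass(frozen=True)
class Event:
    topic: str      # hypothetical topic name
    payload: dict   # hypothetical message body


Handler = Callable[[Event], List[Event]]


class InMemoryBus:
    """Toy in-process stand-in for an event bus (assumption: not Kafka)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)
        self._queue: Deque[Event] = deque()
        self.log: List[Event] = []  # every event seen, in delivery order

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, event: Event) -> None:
        self._queue.append(event)

    def run(self) -> None:
        # Deliver queued events until no handler produces new ones.
        while self._queue:
            event = self._queue.popleft()
            self.log.append(event)
            for handler in self._handlers[event.topic]:
                for produced in handler(event):
                    self.publish(produced)


def validate(event: Event) -> List[Event]:
    """Stateless step: accept only requests asking for at least one CPU."""
    if event.payload.get("cpus", 0) < 1:
        return [Event("deployment.rejected", event.payload)]
    return [Event("deployment.validated", event.payload)]


def schedule(event: Event) -> List[Event]:
    """Stateless step: pick the fitting provider with the most free CPUs."""
    needed = event.payload["cpus"]
    fitting = [c for c in event.payload["candidates"] if c["free_cpus"] >= needed]
    if not fitting:
        return [Event("deployment.rejected", event.payload)]
    best = max(fitting, key=lambda c: c["free_cpus"])
    return [Event("deployment.scheduled",
                  {"id": event.payload["id"], "provider": best["provider"]})]


def build_pipeline() -> InMemoryBus:
    bus = InMemoryBus()
    bus.subscribe("deployment.requested", validate)
    bus.subscribe("deployment.validated", schedule)
    return bus


if __name__ == "__main__":
    bus = build_pipeline()
    bus.publish(Event("deployment.requested", {
        "id": "dep-1",
        "cpus": 4,
        "candidates": [
            {"provider": "site-a", "free_cpus": 2},
            {"provider": "site-b", "free_cpus": 8},
        ],
    }))
    bus.run()
    for e in bus.log:
        print(e.topic, e.payload)
```

Because each handler depends only on the event it receives, a new concern (for example a notifier subscribed to the hypothetical `deployment.scheduled` topic) could be added without touching the validation or scheduling code.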
In parallel, the orchestration microservices are migrating to a Kubernetes-based infrastructure managed with ArgoCD, which enforces a strict GitOps model to simplify deployment, ensure continuous integration, and enhance traceability. The service portfolio is further expanded by federating Kubernetes-based providers, using evolved TOSCA templates for automated deployments.
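The core idea of the GitOps model, in which the state declared in Git is the single source of truth and the cluster is reconciled towards it, can be sketched as follows. This is a conceptual illustration only: it does not use or reproduce the ArgoCD API, and `plan_sync`, the resource names, and the revision strings are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical model: resource name -> revision of its declared spec.
State = Dict[str, str]
Action = Tuple[str, str]  # (operation, resource name)


def plan_sync(desired: State, live: State) -> List[Action]:
    """Compute the actions that bring `live` in line with `desired`.

    `desired` stands for what is declared in Git, `live` for what is
    currently running. Resources are visited in name order so that the
    resulting plan is deterministic.
    """
    actions: List[Action] = []
    for name in sorted(desired.keys() | live.keys()):
        if name not in live:
            actions.append(("create", name))
        elif name not in desired:
            actions.append(("prune", name))
        elif desired[name] != live[name]:
            actions.append(("update", name))
    return actions


if __name__ == "__main__":
    desired = {"scheduler": "v2", "monitor": "v1"}
    live = {"scheduler": "v1", "legacy-core": "v9"}
    print(plan_sync(desired, live))
```

Since every change to the cluster originates from a change to the declared state, the Git history doubles as an audit trail, which is where the traceability benefit comes from.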
Resource allocation and quota management across multiple scientific communities are enforced using Capsule, a multi-tenancy framework for Kubernetes. Together, these modernization efforts address existing technical debt and establish a scalable, secure, and sustainable foundation for the next generation of the INFN Cloud federation.