AI_INFN Technical Meeting


AI_INFN Technical meeting – Minutes and actions

Date: 2024-07-22


Tracked developments:

:arrow_forward: Automation of RKE2 deployments in INFN Cloud

  • [Open since last week] We need to schedule the upgrade of the Kubernetes version in the platform.
    • Discussion: the procedure to upgrade the Kubernetes cluster involves multiple phases and a reboot of the masters,
      so it should not be started just before the summer vacations. The plan is to solve the known issues before the summer
      and to postpone the actual upgrade to fall 2024.

:arrow_forward: Develop monitoring and accounting infrastructure (R. Petrini)

  • The PostgreSQL database is being backed up to a Ceph volume mounted as a POSIX filesystem in the VM running the accounting service.
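
As an illustration of the backup described above, here is a minimal sketch of a dump written to a mounted volume. The mount point `/mnt/ceph-backup`, the database name `accounting`, and both function names are assumptions made for the example; the minutes do not specify them, and the actual backup procedure may differ.

```python
import datetime
import os
import subprocess

# Hypothetical values: the minutes give neither the mount point nor the database name.
BACKUP_MOUNT = "/mnt/ceph-backup"
DB_NAME = "accounting"


def build_backup_command(db_name, mount_point, now):
    """Return the target file path and the pg_dump argument list for one backup."""
    stamp = now.strftime("%Y%m%dT%H%M%S")
    target = os.path.join(mount_point, f"{db_name}-{stamp}.dump")
    # --format=custom writes an archive that can be restored with pg_restore.
    command = ["pg_dump", "--format=custom", f"--file={target}", db_name]
    return target, command


def run_backup(db_name=DB_NAME, mount_point=BACKUP_MOUNT):
    """Dump the database onto the volume, refusing to write if it is not mounted."""
    if not os.path.ismount(mount_point):
        raise RuntimeError(f"{mount_point} is not a mount point; refusing to write the backup")
    target, command = build_backup_command(db_name, mount_point, datetime.datetime.now())
    subprocess.run(command, check=True)
    return target
```

The mount check is there so that a dump is never silently written to the local disk of the VM when the Ceph volume is unavailable.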

:arrow_forward: Environment setup (M. Barbetti, S. Giagu, S. Bordoni, L. Cappelli)

  • The three tutorial notebooks developed by WP4 are running in the prepared environment. We should now finalize them and make them available in the KB.
  • Ferrara is working to bring the GNN use case to the platform.

:arrow_forward: Offloading tests with virtual kubelets (G. Bianchini, D. Ciangottini)

  • Recovery of the distributed infrastructure after the update to interlink 0.3.0
  • We have successfully updated:
    • the ReCaS backend (kueue plugin on INFN Cloud K8s)
    • the T4 backend at Cloud@CNAF (docker plugin on a VM)
  • Still broken:
    • the RTX-5000 backend at Cloud@CNAF (kueue plugin on INFN Cloud K8s)
    • the CloudVeneto backend (still some failures to be followed up)
  • We have not started the procedure yet for:
    • the kueue plugin in interlink CE
    • the virtual node in the platform

During the summer, the effort available to maintain this complex infrastructure is limited.

Giulio is preparing a Helm chart with the monitoring infrastructure.

:arrow_forward: FPGA purchase

  • Macchinone 4 is being installed (an RTX is being installed together with the two V70s).

:arrow_forward: Advanced Hackathon

  • The rooms in UniPD are only available for three days, but we need four. We should converge by the end of this week.

Status legend

:arrow_forward: Active
:fast_forward: Priority
:bangbang: Problems
:parking: Postponed or Blocked by others
:white_check_mark: Completed

Agenda

  • 16:00 - 16:15: News and setup (15m)
    Speaker: Lucio Anderlini (Istituto Nazionale di Fisica Nucleare)
  • 16:15 - 16:35: Emulating the Quantum Computer on FPGAs (20m)
    Speaker: Mirko Mariotti (Istituto Nazionale di Fisica Nucleare)
  • 16:35 - 16:50: Discussion on tasks and priorities (15m)
    Speaker: All
  • 16:50 - 17:00: Any other business (10m)