AI_INFN Technical Meeting
Europe/Rome
Description
Virtual meeting room (zoom): https://l.infn.it/ai-infn-meeting
Date: 2024-06-24
Last week the platform suffered an incident caused by test jobs that failed to unmount their FUSE filesystems.
The nodes needed a reboot, and rebooting broke Calico.
The nodes then had to be rebuilt, and rebuilding broke CUDA.
The GPU operator has been updated.
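To spot jobs that leave FUSE filesystems mounted, a node could be checked for lingering FUSE entries. The following is only a minimal sketch, assuming a Linux node whose mount table is readable in the usual `/proc/mounts` format; the function name `find_fuse_mounts` is hypothetical and this is not the check used on the platform.

```python
def find_fuse_mounts(mounts_text: str) -> list[tuple[str, str]]:
    """Return (mount_point, fs_type) for each FUSE entry in /proc/mounts-style text."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        # "fusectl" is the kernel control filesystem, not a user mount,
        # so only plain "fuse", "fuseblk" and "fuse.<driver>" types count.
        if fs_type in ("fuse", "fuseblk") or fs_type.startswith("fuse."):
            found.append((mount_point, fs_type))
    return found
```

On a node, the function could be fed the contents of `/proc/mounts`, and any mount point it returns after a job has finished would be a candidate for a manual unmount.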
Tracked developments:
Automation of RKE2 deployments in INFN Cloud
- NTR
Develop monitoring and accounting infrastructure (R. Petrini)
- We need an automated backup of the PostgreSQL database used for accounting.
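One possible shape for such a backup is sketched below. It only builds a timestamped `pg_dump` command and picks old dumps to prune; the database name, backup directory, file naming scheme and retention count are all assumptions for illustration, not decisions taken at the meeting.

```python
from datetime import datetime, timezone
from pathlib import Path


def build_backup_command(db_name: str, backup_dir: Path, now: datetime) -> list[str]:
    """Build a pg_dump invocation that writes a timestamped custom-format dump."""
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    target = backup_dir / f"{db_name}-{stamp}.dump"
    # --format=custom produces an archive that pg_restore can read.
    return ["pg_dump", "--format=custom", f"--file={target}", db_name]


def select_expired(backups: list[str], keep: int) -> list[str]:
    """Return the dump names to delete, keeping only the newest `keep`.

    The timestamp format sorts chronologically as plain text, so a
    lexicographic sort is enough for dumps of the same database.
    """
    ordered = sorted(backups)
    return ordered[:-keep] if keep > 0 else ordered
```

The command list could be passed to `subprocess.run(cmd, check=True)` from a scheduled job, with connection settings supplied through the standard `PG*` environment variables.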
Environment setup (M. Barbetti, S. Giagu, S. Bordoni, L. Cappelli)
- conda leaves the environment dirty; the solution is to use unset and clean up.
- Once the buttons are in place, we will ping WP4 again.
- We will document how to clean up the environments for the next users.
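The cleanup mentioned above relies on the shell `unset` builtin. As a rough illustration of the same idea, the Python sketch below removes conda's variables from a copy of the environment before launching a child process. Which variables actually cause the problem is not stated in the minutes; dropping everything prefixed `CONDA_` and the `PATH` entries under `CONDA_PREFIX` is an assumption, and `clean_conda_env` is a hypothetical helper.

```python
import os


def clean_conda_env(env: dict[str, str]) -> dict[str, str]:
    """Return a copy of `env` without conda's variables and PATH entries."""
    prefix = env.get("CONDA_PREFIX")
    # Equivalent of `unset CONDA_PREFIX CONDA_DEFAULT_ENV ...` in a shell.
    cleaned = {k: v for k, v in env.items() if not k.startswith("CONDA_")}
    if prefix and "PATH" in cleaned:
        kept = [
            entry
            for entry in cleaned["PATH"].split(os.pathsep)
            if entry != prefix and not entry.startswith(prefix + os.sep)
        ]
        cleaned["PATH"] = os.pathsep.join(kept)
    return cleaned
```

A child process started with `subprocess.run(cmd, env=clean_conda_env(dict(os.environ)))` would then not inherit the activated environment.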
Offloading tests with virtual kubelets (G. Bianchini, D. Ciangottini)
- The Docker plugin has been integrated into the interlink-CE.
FPGA purchase
- NTR
Advanced Hackathon
- Andrea Paccagnella proposes Padova.
- Possible week: the week of 25 November 2024.
- The secretariat has been asked about the rooms; waiting for a reply.