AI_INFN Technical Meeting
Europe/Rome
Description
Virtual meeting room (zoom): https://l.infn.it/ai-infn-meeting
AI_INFN Technical meeting – Minutes and actions
Date: 2024-07-22
News:
- Open call for the PRIN “InTrEPID: In vivo 3D dosimetry in radiotherapy Treatments with EPID” in Pisa.
Subject: “Development of machine learning models for in vivo dosimetry in radiotherapy”
Deadline: 28/7
More info: https://reclutamento.dsi.infn.it/ (call 26904)
- The agenda of the Workshop “Computing@CSN5” applications and innovations at INFN is [online](https://agenda.infn.it/event/42127/timetable/#20241014)
Tracked developments:
Automation of RKE2 deployments in INFN Cloud
- [Open since last week] We need to schedule the upgrade of the Kubernetes version in the platform
- Discussion: the procedure to upgrade the Kubernetes cluster involves multiple phases and the reboot of the masters.
It is a bad idea to start just before the summer vacations. The plan is to solve the known issues before the summer,
but to postpone the actual upgrade to fall 2024.
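As an illustration of why the upgrade is a multi-phase operation, here is a minimal sketch of a rolling-upgrade plan. It assumes the common practice of upgrading the control-plane (master) nodes one at a time before the workers; it is not the procedure agreed for the platform. The node names and the `Node` and `plan_upgrade` helpers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str  # hypothetical node name
    role: str  # "server" (master / control plane) or "agent" (worker)


def plan_upgrade(nodes):
    """Return ordered steps: servers first, one at a time, then agents."""
    for node in nodes:
        if node.role not in ("server", "agent"):
            raise ValueError(f"unknown role for {node.name}: {node.role}")
    servers = sorted((n for n in nodes if n.role == "server"), key=lambda n: n.name)
    agents = sorted((n for n in nodes if n.role == "agent"), key=lambda n: n.name)
    steps = []
    for node in servers + agents:
        # Each node is fully handled before moving to the next one,
        # so only one node is out of service at any time.
        steps.append(f"drain {node.name}")
        steps.append(f"upgrade and reboot {node.name}")
        steps.append(f"uncordon {node.name}")
    return steps


# Hypothetical cluster layout, for illustration only.
example_cluster = [
    Node("worker-1", "agent"),
    Node("master-2", "server"),
    Node("master-1", "server"),
]
for number, step in enumerate(plan_upgrade(example_cluster), start=1):
    print(f"{number}. {step}")
```

Each master appears as three separate steps, which gives a rough idea of how long the control plane runs with reduced redundancy during the upgrade.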
Develop monitoring and accounting infrastructure (R. Petrini)
- The PostgreSQL database is being backed up to a Ceph volume mounted as a POSIX filesystem in the VM running the accounting service.
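A backup of this kind could be sketched as follows. This is a minimal example and not the script actually used by the accounting service: the mount point `/mnt/ceph-backup`, the database name `accounting` and the helper functions are assumptions made for illustration. It relies only on the standard `pg_dump` options `--format=custom` and `--file`.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def backup_path(backup_dir, db_name, now=None):
    """Build a timestamped dump file path inside the mounted volume."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    return Path(backup_dir) / f"{db_name}-{stamp}.dump"


def pg_dump_command(db_name, target):
    """Return the pg_dump invocation as an argument list (no shell involved)."""
    return ["pg_dump", "--format=custom", f"--file={target}", db_name]


def run_backup(backup_dir, db_name):
    """Dump the database to the volume; raises if pg_dump exits with an error."""
    target = backup_path(backup_dir, db_name)
    target.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(pg_dump_command(db_name, target), check=True)
    return target
```

With the assumed names, a call such as `run_backup("/mnt/ceph-backup", "accounting")` would write one dump file per invocation. Since the Ceph volume is mounted as a regular filesystem, no Ceph-specific client code is needed.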
Environment setup (M. Barbetti, S. Giagu, S. Bordoni, L. Cappelli)
- The three tutorial notebooks developed by WP4 are running in the prepared environment; we should now finalize them and make them available in the KB.
- Ferrara is working to bring the GNN use case to the platform.
Offloading tests with virtual kubelets (G. Bianchini, D. Ciangottini)
- Recovery of the distributed infrastructure after the update to interlink 0.3.0
- We have successfully updated:
- the ReCaS backend (kueue plugin on INFN Cloud K8s)
- the T4 backend at Cloud@CNAF (docker plugin on a VM)
- Still broken:
- the RTX-5000 backend at Cloud@CNAF (kueue plugin on INFN Cloud K8s)
- the CloudVeneto backend (still some failures to be followed up)
- We have not started the procedure yet for:
- The kueue plugin in interlink CE
- The virtual node in the platform
During the summer, the effort available to maintain this complex infrastructure is limited.
A Helm chart with the monitoring infrastructure is being prepared by Giulio.
FPGA purchase
- Macchinone 4 is being installed (an RTX is being installed together with the two V70s).
Advanced Hackathon
- The rooms in UniPD are only available for three days, but we need four. We should converge by the end of this week.
Status legend
Active
Priority
Problems
Postponed or Blocked by others
Completed