AI_INFN Technical meeting – Minutes and actions
Date: 2025-03-17
News
- Database consuntivi: deadline April 9th.
- “XXII Seminar on Software for Nuclear, Subnuclear and Applied Physics, Alghero”, registration deadline May 10th, 2025. [Link]
- “Introductory Course to VHDL and HLS FPGA Programming, Milano”, by ICSC-Spoke 2 - WP4 agenda, registration deadline April 30th.
Operations
- The RKE2 cluster update took longer than expected. Offloading still has to be restored. See slides.
Tracked developments:
Automation of RKE2 deployments in INFN Cloud
- March 3
- Gioacchino tagged the image with a new versioning scheme. Snakemake is now available.
- Gioacchino is working on Jupyter for INFN Cloud, trying to remain aligned with AI_INFN.
- Plan migration to Jupyter 5 together with Gioacchino.
- The renamed image will soon become the default one on the platform.
- March 10
- Update stopped because authentication is being forced every minute; the cause is still to be understood.
Development activity is paused because a course is ongoing on the same infrastructure.
The course should be concluded by tomorrow.
Develop monitoring and accounting infrastructure (R. Petrini)
- March 10
- We are running without monitoring for the file system, which also implies no accounting. Will try to mitigate this on Wednesday afternoon while restarting the services.
- March 17
- Meeting tomorrow to kick off the storage monitoring.
Environment setup (S. Giagu, S. Bordoni, L. Cappelli)
- March 17
- Giuliano Panico (owner of an A30 GPU in the AI_INFN tenancy) is migrating his activity to the platform. Issues with the environment setup are being followed up by Francesca.
Offloading tests with virtual kubelets (G. Bianchini, D. Ciangottini)
- March 3
- 3 buckets made available; not yet tested;
- Work on GPU continues.
- Offloading towards FPGA: the SSH tunnel no longer works, and it is not clear why.
- March 10
- Offloading towards GPU: the NATS plugin was updated to use “SlurmFlavor” in order to support GPU usage. The flavor is selected by scanning the available flavors from the cheapest to the most expensive.
- Offloading towards FPGA: the interlink VK supports FPGA provisioning with the docker plugin. A pod requesting an FPGA can be scheduled in the same way as one requesting a GPU. All the Xilinx tools can already be used in the Jupyter notebook. Tests were run with a U55c in Perugia, and the first “simple” tests all appear to work correctly. The system could also be made to work with the V70.
- Stefano Dal Pra is setting up a call to organize the tests.
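A minimal sketch of the flavor selection mentioned for the GPU offloading, for illustration only. The minutes state only that the available flavors are scanned from the cheapest to the most expensive. The `Flavor` fields, the `select_flavor` function and the matching criteria (CPU, memory, GPU count) are assumptions and do not reflect the actual plugin code.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Flavor:
    """Hypothetical description of a Slurm flavor (all fields are assumptions)."""
    name: str
    cost: float      # relative cost, arbitrary unit
    cpus: int
    memory_gb: int
    gpus: int


def select_flavor(
    flavors: Sequence[Flavor], cpus: int, memory_gb: int, gpus: int = 0
) -> Optional[Flavor]:
    """Return the cheapest flavor that satisfies the request, or None."""
    # Scan the flavors from the cheapest to the most expensive and
    # stop at the first one that covers every requested resource.
    for flavor in sorted(flavors, key=lambda f: f.cost):
        if (
            flavor.cpus >= cpus
            and flavor.memory_gb >= memory_gb
            and flavor.gpus >= gpus
        ):
            return flavor
    return None
```

With this approach a pod that requests no GPU lands on the cheapest CPU-only flavor, while a pod requesting one GPU skips it and takes the cheapest flavor that offers a GPU.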
FPGA purchase
- March 10
- Stefano G. is sending the U55c FPGA to CNAF;
- Lucio asks to remove one of the V70s and send it to Ferrara, to continue the tests with two different hypervisors and collect additional information.
Status legend
Active
Priority
Problems
Postponed or Blocked by others
Completed