AI_INFN Technical Meeting
Europe/Rome
Description
Virtual meeting room (zoom): https://l.infn.it/ai-infn-meeting
AI_INFN Technical meeting – Minutes and actions
Date: 2025-01-13
Operations
- We restarted the MinIO cluster that provides resources to JuiceFS and moved the metadata engine from Redis to PostgreSQL. No performance degradation was observed. Dashboards for PostgreSQL and MinIO have been drafted.
- On January 11, failures of CVMFS FUSE mounts were reported. The problem was identified and (hopefully) fixed: the Squid proxy was configured without specifying the port (3128).
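A missing proxy port like the one above can be caught with a small check before clients are reconfigured. The sketch below assumes the port was missing from a client-side proxy string in the style of `CVMFS_HTTP_PROXY` (where `;` separates failover groups and `|` separates load-balanced proxies); the minutes do not say where exactly the port was omitted, and the function name and host names are hypothetical.

```python
from urllib.parse import urlsplit


def proxies_missing_port(proxy_setting: str) -> list[str]:
    """Return the entries of a CVMFS_HTTP_PROXY-style string that lack a valid explicit port.

    Assumption: ';' separates failover groups and '|' separates
    load-balanced proxies within a group; 'DIRECT' means no proxy.
    """
    missing = []
    for group in proxy_setting.split(";"):
        for entry in group.split("|"):
            entry = entry.strip()
            if not entry or entry.upper() == "DIRECT":
                continue
            # urlsplit only parses host and port when a scheme is present
            url = entry if "://" in entry else "http://" + entry
            try:
                port = urlsplit(url).port
            except ValueError:
                # Non-numeric or out-of-range port: treat as not defined
                port = None
            if port is None:
                missing.append(entry)
    return missing
```

For example, `proxies_missing_port("http://squid.example.org")` reports the entry as lacking a port, while `proxies_missing_port("http://squid.example.org:3128;DIRECT")` returns an empty list.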
Tracked developments:
Automation of RKE2 deployments in INFN Cloud
- The button in the INFN Cloud dashboard redirecting to the AI_INFN platform has been added. For the moment it is only visible in the development version of the PaaS and for users in the admin/ai-infn group.
- Gioacchino pushed the Docker images of JupyterLab and JupyterHub to the DataCloud repository, with automated builds.
- An unexpectedly old version of snakemake was observed in conda; in principle it is not installed from the Docker image.
- Lucio will try to use Gioacchino’s image as the default image in the AI_INFN Platform.
Develop monitoring and accounting infrastructure (R. Petrini)
- The Grafana dashboard has been extended to:
  - monitor the hackathon platform
  - monitor the PostgreSQL instance used for the new JuiceFS
  - monitor the MinIO instance used by the new JuiceFS (this works only partially: 1/3 dashboards)
- Improving the monitoring of PostgreSQL and MinIO requires some “invasive” actions, at least during trial and error. It was decided to create a test environment with the same configuration but much less disk space, as a playground for setting up the monitoring. This will also make it possible to document the procedure for building the storage cluster.
Environment setup (S. Giagu, S. Bordoni, L. Cappelli)
- Luca Clissa reported a problem cloning an environment. Lucio is following up.
Offloading tests with virtual kubelets (G. Bianchini, D. Ciangottini)
- Giulio (offline): (almost) solved two issues raised by Lucio in the interLink API server.
FPGA purchase
- The ball is in Diego Michelotto’s court: the most promising strategy is to install the Xilinx software on bare metal and then re-add the higher-level virtualization layer.
- Unfortunately, AlmaLinux 9 is not officially supported by Xilinx, so success is hard to guarantee.
- Enrico Calore reported on a discussion he had with Xilinx engineers at the SuperComputing conference. Vitis AI will not be supported on new hardware boards, and VHDL/HLS will be the main toolchain for upcoming FPGAs. Unfortunately, V70 FPGAs can only be programmed using Vitis AI.
- Andrea Rigoni supports the need for FPGAs in the Cloud, possibly close to GPU devices, for quantization studies: https://indico.cern.ch/event/1405026/contributions/5910214/attachments/2933286/5151597/cern_edge_ml_fpga_ai.pdf