AI_INFN Technical Meeting

Europe/Rome

Virtual meeting room (zoom): https://l.infn.it/ai-infn-meeting

Date: 2024-03-18

The discussion on operations and operational tooling is postponed to next week.

We will need to switch off M2 (and all the virtual machines it hosts) for about one hour on Thursday.

Abstract submission for the CCR meeting in Palau is open. If you plan to submit abstracts on activities related to AI_INFN, please get in touch to coordinate the effort.

Tracked developments:

:fast_forward: Tests on deployments with RKE2 (L. Anderlini, R. Petrini, G. Misurelli, M. Corvo)

  • In a meeting with the DataCloud PMB (Friday, March 15), the requirements for deploying the platform in the PaaS were discussed.
  • DataCloud is supportive, and WP5 committed to moving the discussion forward.

:arrow_forward: Port monitoring infrastructure to Helm chart (R. Petrini)

  • Still no reply from the DataCloud PMB on the possibility of using the centralized Grafana instance. We have reached the maximum number of administrators (users) available in the free tier of Grafana Cloud (3 admins).
  • The problem with profiling the RTX5000 GPU leaves the metrics previously used in Grafana empty. We moved to another, less self-explanatory metric that still provides a relative indication of how much the GPU is used, but we lost the indication of the fraction of GPU memory actually in use.
  • No additional work was done on the accounting database (effort is needed to make it secure and backed up). Lucio will ping Stefano Dal Pra.
  • Work on the dashboard: we are studying Dash.
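
As an illustration of the two quantities mentioned above (relative GPU usage and the fraction of GPU memory in use), the following minimal sketch parses text in the format printed by `nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total --format=csv,noheader,nounits`. It is not the exporter used in the monitoring infrastructure. The sample values and the function name `parse_gpu_metrics` are hypothetical, and whether this query is affected by the RTX5000 profiling problem is not established here.

```python
import csv
import io

# Hypothetical sample of the text printed by:
#   nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total \
#              --format=csv,noheader,nounits
# One line per GPU: utilization (%), memory used (MiB), memory total (MiB).
SAMPLE_OUTPUT = "37, 4096, 16384\n0, 512, 16384\n"


def parse_gpu_metrics(text):
    """Return one dict per GPU with utilization (%) and memory fraction (0..1)."""
    metrics = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:  # skip empty lines
            continue
        utilization, used_mib, total_mib = (float(value) for value in row)
        metrics.append(
            {
                "utilization_percent": utilization,
                "memory_fraction": used_mib / total_mib,
            }
        )
    return metrics


gpu_metrics = parse_gpu_metrics(SAMPLE_OUTPUT)
for index, gpu in enumerate(gpu_metrics):
    print(index, gpu["utilization_percent"], gpu["memory_fraction"])
```

In a real setup, the sample string would be replaced by the command output captured on the GPU node, and the resulting values could be exposed to whichever dashboard is eventually chosen.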

:arrow_forward: Define a list of libraries for QC simulations in Cloud (S. Giagu, S. Bordoni)

  • Matteo is looking at the QC environment.

:arrow_forward: Offloading tests with virtual kubelets (G. Bianchini, D. Ciangottini)

  • The simplified Helm chart developed by Giulio has been deployed on INFN Cloud resources.
  • The target is a single VM with an NVIDIA T4 GPU, provided by AI_INFN.
  • The deployment for the upcoming tests is: https://jhub.131.154.98.92.myip.cloud.infn.it

:arrow_forward: FPGA purchase

  • NTR.

:arrow_forward: User’s forum

  • The Indico page of the User’s forum (June 11-12) is available here: https://agenda.infn.it/event/40489/
  • Please get in touch with Elisabetta and Matteo if you wish to present something.

Status legend

:arrow_forward: Active
:fast_forward: Priority
:bangbang: Problems
:parking: Postponed or Blocked by others
:white_check_mark: Completed

There are minutes attached to this event.
    • 1. News and setup
      Speaker: Lucio Anderlini (Istituto Nazionale di Fisica Nucleare)
    • 2. Setting up Day-2 Operations [Postponed]
      Speaker: Giuseppe Misurelli (Istituto Nazionale di Fisica Nucleare)
    • 3. Discussion on tasks and priorities
      Speaker: All
    • 4. Any other business