AI_INFN Technical Meeting

Europe/Rome

AI_INFN Technical meeting – Minutes and actions

Date: 2024-03-04

An update on the migration is discussed. See slides.

In summary,

  • We aim to migrate on Wednesday 6; ALL INFN Cloud PaaS deployments must be deleted by tomorrow, Tuesday 5
  • From Thursday 7 we hope to start re-enabling deployments
  • The deployment https://jhub.131.154.97.212.myip.cloud.infn.it will remain available until the end of this week
  • Next Monday, at this meeting, there will be a final opportunity to object to its deletion and request a further (short) delay.

Tracked developments:

:fast_forward: Tests on deployments with RKE2 (L. Anderlini, R. Petrini, G. Misurelli, M. Corvo)

  • NTR

:arrow_forward: Port monitoring infrastructure to Helm chart (R. Petrini)

  • Developed a custom exporter for NFS. Found a problem in the custom exporter used to monitor the number of files and the overall size of the NFS exports: when the number of files increases, the exporter starts using a large amount of RAM and the node starts failing. We will try to look into this before migrating (i.e. by tomorrow).
  • Grafana: we are currently using grafana.net to develop the dashboards while waiting for a decision by DataCloud on whether to host our dashboards in the central Grafana instance.
  • Problems with the monitoring of an RTX node. We are investigating. Hopefully the problem will resolve itself once everything is freshly redeployed in the new zone…
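
For reference on the NFS exporter memory issue noted above: the minutes do not describe how the exporter collects its statistics, so the following is only a minimal sketch, assuming the RAM growth comes from holding a full file listing in memory. It counts files and sums sizes while streaming directory entries, keeping only the list of directories still to visit. The function name `count_files_and_bytes` is hypothetical and not taken from the actual exporter.

```python
import os


def count_files_and_bytes(root):
    """Return (n_files, total_bytes) for the tree under root.

    Hypothetical helper, not the actual exporter code: only the paths of
    directories still to visit are kept in memory, never a full file list.
    """
    n_files = 0
    total_bytes = 0
    stack = [root]  # directories still to visit
    while stack:
        current = stack.pop()
        try:
            # os.scandir yields entries lazily instead of building a list
            with os.scandir(current) as entries:
                for entry in entries:
                    try:
                        if entry.is_dir(follow_symlinks=False):
                            stack.append(entry.path)
                        elif entry.is_file(follow_symlinks=False):
                            n_files += 1
                            total_bytes += entry.stat(follow_symlinks=False).st_size
                    except OSError:
                        # entry vanished or is unreadable: skip it
                        continue
        except OSError:
            # directory vanished or is unreadable: skip it
            continue
    return n_files, total_bytes
```

Whether this applies depends on where the exporter actually spends its memory, which is still to be investigated.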

:arrow_forward: Define a list of libraries for QC simulations in Cloud (S. Giagu, S. Bordoni)

  • NTR

:arrow_forward: Offloading tests with virtual kubelets (G. Bianchini, D. Ciangottini)

  • G. Bianchini sent apologies (teaching at the FPGA course)
    • The development of the example Helm chart to set up the client for VK submissions is progressing;
      work on the documentation continues.

:arrow_forward: FPGA purchase

  • Discussion is ongoing to decide which server to buy; it could be advantageous to buy a dedicated server for the FPGAs. The tender specification has been prepared and verified by Chierici and Dal Pra. We will converge shortly.
  • The purchase will be managed centrally at CNAF, for both the server and the V70.
  • Dal Pra: do we need to request anything in advance when ordering the FPGAs? (E.g. connectors…)
    • S. Giagu: We have never had the V70s in our hands, so we do not know. Power is supplied via PCIe.
    • E. Calore: The U55c, on the other hand, needs additional power through the connector (only a few euros, anyway)

:arrow_forward: User’s forum

  • The dates defined last week are not a viable option. Will propose new dates.

Status legend

:arrow_forward: Active
:fast_forward: Priority
:bangbang: Problems
:parking: Postponed or Blocked by others
:white_check_mark: Completed

    • 16:00 - 16:15
      News and setup 15m
      Speaker: Lucio Anderlini (Istituto Nazionale di Fisica Nucleare)
    • 16:15 - 16:50
      Discussion on tasks and priorities 35m
      Speaker: All
    • 16:50 - 17:00
      Any other business 10m