The CNAF Tier-1 facility provides computing resources to over 60 scientific communities (not limited to LHC) and supports more than 1500 active users through its dedicated User Support (US) department. As the number of users grows and new technologies continue to evolve, the US department plays a crucial role in helping users efficiently utilize the computing infrastructure and adopt the latest software technologies. When necessary, other specialized CNAF departments collaborate with US to address specific requests, enriching the ticketing system with tutorials and technical expertise that, despite their value, would be challenging to incorporate into the Tier-1 User Guide.
To improve the efficiency of the US department, predicting (and potentially automating) the involvement of relevant departments for specific user requests is a key objective. Modern Natural Language Processing (NLP) techniques offer promising solutions for accurate classification of user communications, even in CNAF’s multilingual environment (English/Italian).
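As a minimal sketch of this idea, the example below routes incoming tickets to a department by combining a multilingual sentence-embedding model with a simple linear classifier. The model name, example tickets, and department labels are illustrative assumptions, not the production configuration.

```python
# Minimal sketch: routing user tickets to departments with multilingual embeddings.
# Model name, example tickets, and labels are illustrative assumptions only.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# A multilingual model handles mixed English/Italian tickets (assumed choice).
encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# Tiny hypothetical training set: (ticket text, target department).
tickets = [
    "My jobs stay pending in the HTCondor queue",
    "Non riesco ad accedere ai dati su storage GPFS",
    "How do I renew my grid certificate?",
    "Il job viene ucciso per memoria insufficiente",
]
departments = ["farming", "storage", "user-support", "farming"]

# Encode tickets into dense vectors and train a simple classifier on top.
X = encoder.encode(tickets)
clf = LogisticRegression(max_iter=1000).fit(X, departments)

# Classify a new (Italian) request; the predicted label suggests which
# department should be involved in the ticket.
new_ticket = ["Errore di quota superata durante la scrittura su disco"]
print(clf.predict(encoder.encode(new_ticket))[0])
```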
The integration of Foundation Models (FMs) such as GPT-4, Llama 3 or Gemini further enhances these efforts. A Retrieval Augmented Generation (RAG) architecture powered by an FM makes it possible to define a semantics-aware pipeline that extracts relevant information from a preprocessed knowledge base. This knowledge base can include the Tier-1 User Guide, the ticketing system, and other carefully selected web resources (e.g., software documentation). These foundations pave the way for developing a "digital agent" capable of automating user responses and streamlining support workflows using injected, context-aware information.
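The sketch below illustrates the retrieval step of such a RAG pipeline under simplifying assumptions: knowledge-base chunks are embedded, the passages most similar to a user question are retrieved, and a context-augmented prompt is assembled for the FM. The document snippets and model name are hypothetical, and the final FM call is left as a placeholder since the production interface is not fixed here.

```python
# Minimal RAG sketch: retrieve relevant knowledge-base passages and build a
# context-augmented prompt for a Foundation Model. Snippets and model names
# are illustrative assumptions, not the actual CNAF deployment.
import numpy as np
from sentence_transformers import SentenceTransformer

# Hypothetical, pre-chunked knowledge base (User Guide, past tickets, docs).
kb_chunks = [
    "To submit jobs at CNAF Tier-1, use HTCondor with condor_submit.",
    "Per montare l'area di storage GPFS usare il path indicato nella guida.",
    "Grid certificates can be renewed through the INFN CA portal.",
]

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# Embed and L2-normalize the chunks once; cosine similarity then reduces
# to a dot product at query time.
kb_vecs = encoder.encode(kb_chunks)
kb_vecs = kb_vecs / np.linalg.norm(kb_vecs, axis=1, keepdims=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k knowledge-base chunks most similar to the question."""
    q = encoder.encode([question])[0]
    q = q / np.linalg.norm(q)
    scores = kb_vecs @ q
    return [kb_chunks[i] for i in np.argsort(scores)[::-1][:k]]

question = "Come posso rinnovare il mio certificato grid?"
context = "\n".join(retrieve(question))

# Context-augmented prompt; the call to the FM (GPT-4, Llama 3, Gemini, ...)
# is deliberately omitted and would consume this prompt.
prompt = (
    "Answer the user question using only the context below.\n"
    f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)
print(prompt)
```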