





### Research activities

Luca Giambastiani

LHCb-Padova meeting - 17/11/2020

# Readout in LHCb-Upgrade

- Context of my research activity: LHCb-Upgrade and heterogeneous computing on FPGA.
- Event Builder (EB): computer farm for detector readout and event building. Parts of the same event arrive at different EB nodes.
  A high-speed network (Event Builder Network) connects all nodes, so that each event can be assembled on a single node before being sent to the HLT.
- Data are received by means of a readout board, installed in each server, called **PCIe40**. It is a custom PCI Express board equipped with 24 optical links and an Intel Arria 10 FPGA for data processing. It sends data to the server's memory via the PCIe connector.
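
A toy sketch of the event-building step described above: fragments of the same event arrive from different sources and are merged on one node. The function name, the tuple layout and the round-robin choice of builder node (`event_id % n_nodes`) are assumptions made for illustration, not the actual LHCb Event Builder logic.

```python
from collections import defaultdict


def build_events(fragments, n_nodes):
    """Group fragments by event ID and assign each complete event to one node.

    `fragments` is an iterable of (event_id, source_node, payload) tuples,
    where payload is a bytes object. Events that are missing a fragment
    from any of the `n_nodes` sources are skipped.
    """
    by_event = defaultdict(dict)
    for event_id, source_node, payload in fragments:
        by_event[event_id][source_node] = payload

    built = {}
    for event_id, parts in by_event.items():
        if len(parts) != n_nodes:
            continue  # incomplete event: not all sources have reported yet
        builder = event_id % n_nodes  # hypothetical round-robin assignment
        data = b"".join(parts[src] for src in sorted(parts))
        built[event_id] = (builder, data)
    return built
```

For example, with two sources, `build_events([(0, 0, b"a"), (0, 1, b"b")], 2)` assembles event 0 on node 0 from both fragments.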



# Thesis work: RETINA clustering

- **RETINA**: Real-time tracking on FPGA for future upgrades (beyond Run 3) of LHCb. Tracking in the VErtex LOcator (VELO) requires pre-processed clusters (groups of active pixels). The purpose of my research activity during my master's thesis was to perform clustering on the FPGAs in the readout boards.
- The first prototype was developed in VHDL and tested in Pisa on a Stratix V device, but it was not conceived for integration in the overall firmware. **Work done**:
  - removal of auxiliary components;
  - porting to the architecture of the Arria 10 FPGAs installed in the readout boards;
  - implementation of components needed for a complete integration;
  - optimizations to the existing firmware.
- Debugging and testing to ensure that all modifications work correctly.
- Tools used:
  - QuestaSim for firmware simulation;
  - Quartus for generating IP (Intellectual Property) cores;
  - VHDL.
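
To make the notion of a cluster concrete, here is a minimal software sketch that groups active pixels into 8-connected clusters with a generic flood fill and computes their centroids. It is only an illustration of the concept: it is not the RETINA firmware algorithm, and the function names and the (row, col) pixel representation are assumptions.

```python
def cluster_pixels(active):
    """Group active pixels into clusters of 8-connected neighbours.

    `active` is an iterable of (row, col) pairs. Returns a sorted list of
    clusters, each cluster being a sorted list of pixels.
    """
    remaining = set(active)
    clusters = []
    while remaining:
        seed = remaining.pop()
        stack, cluster = [seed], [seed]
        while stack:
            r, c = stack.pop()
            # Visit the 8 neighbours; the pixel itself is no longer in
            # `remaining`, so the (0, 0) offset is harmless.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nb = (r + dr, c + dc)
                    if nb in remaining:
                        remaining.remove(nb)
                        stack.append(nb)
                        cluster.append(nb)
        clusters.append(sorted(cluster))
    return sorted(clusters)


def centroid(cluster):
    """Return the unweighted mean (row, col) position of a cluster."""
    n = len(cluster)
    return (sum(r for r, _ in cluster) / n, sum(c for _, c in cluster) / n)
```

A tracking stage would then consume the centroids rather than the individual pixels, which is the role of the pre-processed clusters mentioned above.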

# Thesis work: Performance studies

- Clustering and tracking quality were analyzed and compared with the baseline CPU algorithm. The studies were performed with the LHCb simulation, with the addition of a software emulator of the clustering firmware.
- Clustering quality is found to be very similar, and tracking quality is hardly distinguishable. A bias in the reconstructed primary vertex position was found and corrected.
- Throughput gain: 8% for HLT1 on CPUs, 3.5% for HLT1 on GPUs.





# The RETINA project

- Real-time tracking on FPGA for future upgrades (beyond Run 3) of LHCb. Each EB node will be equipped with another FPGA board (*Retina board*). A dedicated network allows data exchange between them.
- Each Retina board receives data from the PCIe40 and reconstructs some of the tracks of the event.
- Finally, it returns them to the server's memory, together with the original data. This is achieved with PCIe Gen3 DMA.



# INFN scholarship: DMA transfers for Retina boards

- Driver and firmware for DMA transfers between Retina boards and memory are based on those of the PCIe40, but:
  - The Retina boards must be capable of data transfers in both directions, while the PCIe40 only has to write data into the server's memory;
  - The FPGA is different.
- Two board models are used for tests, equipped with two different FPGAs: Arria V and Stratix 10.
- Firmware based on the PCIe40 design, ported to Arria V, was available from the beginning of the work. It had two data streams, one for each direction, but the server-to-board stream (*Rtna Stream*) was not working. Completing it is one of the main goals of the activity.



# INFN scholarship: DMA streams with the Arria V board

- The firmware for the rtna stream has been fixed and now works correctly in simulation (QuestaSim).
- We tried to "short circuit" the two streams together (board side) to test whether the firmware is capable of simultaneous transfers without interference between the streams.
  - Data sent in the rtna stream should be read back from the main stream.
  - In QuestaSim simulation, the firmware in this configuration stalls after a few DMA bundles.
  - The stall is suspected to be caused by an issue in the simulation of the root complex.
- The FPGA has been programmed with the compiled firmware to test the short-circuited configuration on real hardware. This test is still ongoing.
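
The host-side comparison in a loopback test of this kind can be sketched as follows. This is a minimal illustration, not the actual test software: `transfer` is a hypothetical callable standing in for "write to the rtna stream, read back from the main stream", and no real driver API is used.

```python
def first_mismatch(sent, received):
    """Return the byte offset of the first difference, or None if equal."""
    for offset, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            return offset
    if len(sent) != len(received):
        # One buffer is a strict prefix of the other.
        return min(len(sent), len(received))
    return None


def loopback_check(transfer, pattern):
    """Send `pattern` through `transfer` and compare what comes back.

    `transfer` is a hypothetical stand-in for the DMA round trip: it takes
    the bytes to send and returns the bytes read back.
    """
    received = transfer(pattern)
    return first_mismatch(pattern, received)
```

A known test pattern such as `bytes(range(256)) * 4` makes a reported offset easy to interpret, since the expected value at each position follows directly from the offset.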



# INFN scholarship: The Stratix 10 board

- First goal: check whether the board is seen by the server and whether we can write and read back data with DMA. Two PCIe example designs were used, for x8 and x16 link widths.
  - FPGA successfully programmed with the compiled firmware.
  - There are problems in the interaction with the board: the data read back do not match the data written, and the system crashes.
- Second goal: porting of the Arria V-based firmware to the Stratix 10 chip. This includes setting up a simulation environment with QuestaSim and porting all IP (Intellectual Property) cores to the Stratix 10 architecture.
  - The most important IP system is the one responsible for the PCIe interface. Its logical structure has been modified because one of its sub-components has a different implementation in the Stratix 10.

