14–17 May 2012
Sala SS. Marcellino e Festo - Largo San Marcellino, 10 - Napoli
Europe/Rome timezone
The sessions on 15, 16 and 17 May will be broadcast as a live HD stream in Flash encoding. Links to the streaming pages will be announced in the days before the meeting.

Electromagnetic particle in cell simulations on GPU clusters: a case study

17 May 2012, 15:15
25m
Sala SS. Marcellino e Festo - Largo San Marcellino, 10 - Napoli

Speaker

Francesco Rossi (Università di Bologna)

Description

We present Jasmine, a fully relativistic, 3D, electromagnetic Particle In Cell (PIC) code capable of running on hybrid HPC (High Performance Computing) clusters and exploiting the computational power of both CUDA GPUs and CPUs. The modularity of the code and its advanced C++ implementation allow the core algorithms to be extended easily to various simulation schemes. When porting a PIC scheme to a GPU-based machine, the particle-to-grid operations (e.g. the evaluation of the current density) need special care to avoid memory inconsistencies, because many particles may update the same grid location concurrently. Here we present how we implemented this operation with a parallel evaluation for each grid cell, relying on robust and efficient sorting and stream compaction algorithms. Running demanding simulations on GPUs offers the high processing power of the graphics boards at the expense of the rather limited memory available per board. We have tackled the GPU memory limitation by streaming particle chunks asynchronously from the main node memory to the GPUs. This chunking technique can also be used to hide the network transfer overhead that occurs in the multi-GPU parallelization. We compare the performance of Jasmine on different architectures: pure CPU (Intel Xeon), GPU (NVIDIA Fermi board), and a hybrid HPC cluster (Intel Xeon + NVIDIA Fermi). The processing time per particle is 13 ns in the 2D case and 80 ns in the 3D case, for a relativistic plasma simulation with grid staggering, double precision and quadratic shape functions, running on an NVIDIA Fermi board.
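
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind a sort-based, per-cell deposition, not Jasmine's actual code. It is written as serial C++ and makes several simplifying assumptions: one spatial dimension, charge rather than current density, and linear rather than quadratic shape functions. The names `Particle` and `deposit_sorted` are hypothetical. Particles are first sorted by cell index, then each grid node gathers contributions only from the particles in its two neighbouring cells, so every node is written by exactly one loop iteration. On a GPU, that outer loop could be mapped to one thread per node, which is what avoids conflicting writes.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical particle record: position in units of the cell size
// (0 <= x < n_cells) and a charge weight.
struct Particle {
    double x;
    double w;
};

// Deposit charge onto n_cells + 1 grid nodes using linear weighting.
// The particle vector is taken by value because it is sorted internally.
std::vector<double> deposit_sorted(std::vector<Particle> particles,
                                   std::size_t n_cells) {
    auto cell_of = [](const Particle& p) {
        return static_cast<std::size_t>(std::floor(p.x));
    };

    // Step 1: sort particles by the index of the cell that contains them.
    std::sort(particles.begin(), particles.end(),
              [&](const Particle& a, const Particle& b) {
                  return cell_of(a) < cell_of(b);
              });

    // Step 2: build cell offsets with a counting pass and a prefix sum.
    // Particles of cell c occupy [cell_start[c], cell_start[c + 1]).
    std::vector<std::size_t> cell_start(n_cells + 1, 0);
    for (const Particle& p : particles) {
        ++cell_start[cell_of(p) + 1];
    }
    std::partial_sum(cell_start.begin(), cell_start.end(), cell_start.begin());

    // Step 3: each node gathers from its right and left neighbouring cells.
    // Iterations are independent: node `n` is the only one writing rho[n].
    std::vector<double> rho(n_cells + 1, 0.0);
    for (std::size_t node = 0; node <= n_cells; ++node) {
        double sum = 0.0;
        if (node < n_cells) {  // cell to the right of the node
            for (std::size_t i = cell_start[node]; i < cell_start[node + 1]; ++i) {
                const double frac = particles[i].x - static_cast<double>(node);
                sum += particles[i].w * (1.0 - frac);
            }
        }
        if (node > 0) {  // cell to the left of the node
            for (std::size_t i = cell_start[node - 1]; i < cell_start[node]; ++i) {
                const double frac = particles[i].x - static_cast<double>(node - 1);
                sum += particles[i].w * frac;
            }
        }
        rho[node] = sum;
    }
    return rho;
}
```

For example, calling `deposit_sorted({{2.5, 4.0}, {0.25, 1.0}, {0.75, 2.0}}, 3)` returns four node values whose sum equals the total weight of 7, since linear weighting conserves charge. The stream compaction step mentioned in the abstract is not shown here.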
