Speaker
Dr
Enrico Calore
(FE)
Description
An increasing number of massively parallel machines are based on heterogeneous node architectures that combine traditional, powerful multicore CPUs with energy-efficient accelerators. Programming heterogeneous systems can be cumbersome, and designing efficient codes for them can be a hard task.
The lack of standard programming frameworks for accelerator-based machines complicates matters further: in most cases the best efficiency can only be achieved by rewriting the code, usually written in C or C++, in proprietary programming languages such as CUDA.
OpenACC offers a different approach based on directives. Porting applications to run on hybrid architectures "only" requires annotating existing code with specific "pragma" directives. These identify the functions to be executed on accelerators and instruct the compiler on how to generate and structure code for a specific target device.
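As a minimal sketch of what such an annotation can look like, the C function below offloads a simple vector update with one OpenACC directive. The function name `saxpy_acc` and the choice of kernel are illustrative assumptions and are not taken from the LQCD code discussed in this talk. A compiler without OpenACC support ignores the pragma and runs the loop on the CPU.

```c
/* Hypothetical example kernel (not from the talk): y[i] = a * x[i] + y[i]. */
void saxpy_acc(int n, float a, const float *restrict x, float *restrict y)
{
    /* The directive asks an OpenACC compiler to run the loop on the
     * accelerator. The data clauses describe the transfers: x is copied
     * to the device only, y is copied in and back out. Without OpenACC
     * support the pragma is ignored and the loop runs serially. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The loop body is unchanged from a plain C version, so the same source can still be built for a CPU-only machine.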
In this talk we present our experience in designing and optimizing an LQCD code targeting multi-GPU clusters, giving details of the implementation and presenting preliminary results.
Author
Dr
Enrico Calore
(FE)