10–12 Sept 2014
University of Pisa
Europe/Rome timezone

Conjugate Gradient solvers on Intel Xeon Phi and NVIDIA GPUs

10 Sept 2014, 14:30
30m
University of Pisa

Polo Fibonacci, Largo Bruno Pontecorvo, 3, I-56127 Pisa, phone +39 050 2214 327 (map: https://www.google.com/maps/place/Dipartimento+di+Fisica/@43.720239,10.407985,17z/data=!3m1!4b1!4m2!3m1!1s0x12d591bb7d8c8ec9:0xbf91ddd442e32978)

Speaker

Mr Patrick Steinbrecher (Fakultät für Physik, Universität Bielefeld)

Description

The runtime of a Lattice QCD simulation is dominated by a small kernel that calculates the product of a sparse matrix, known as the “Dslash” operator, with a vector. This kernel is therefore frequently optimized for various HPC architectures. In this contribution we compare the performance of the Intel Xeon Phi with that of current Kepler-based NVIDIA Tesla GPUs when running a conjugate gradient solver. By inverting for multiple vectors (right-hand sides) at the same time, we expose more parallelism to the accelerator and obtain a performance of more than 250 GFLOP/s on both architectures. This more than doubles the performance of the naive approach of inverting each vector separately. A detailed comparison of the performance of the accelerators for different scenarios will be presented in the talk. We also discuss some details of the implementation and the effort required to reach this performance.
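The idea of inverting for several vectors at once can be illustrated with a minimal sketch. It is not the implementation discussed in the talk: the operator here is a toy stand-in (a one-dimensional periodic Laplacian plus a mass term, chosen only because it is symmetric positive definite), not the Dslash operator, and the names apply_operator, cg_multi_rhs and mass_sq are hypothetical. Each right-hand side keeps its own conjugate gradient recurrence, while the operator is applied to all vectors in a single pass, so the stencil data for a site is read once and reused for every vector.

```python
import math


def apply_operator(vectors, mass_sq=0.5):
    """Apply a toy SPD stencil to every vector in one pass.

    Assumption: a 1D periodic Laplacian plus mass term stands in for the
    real lattice operator. The site loop is outermost, so the stencil for
    a site is set up once and reused for all right-hand sides.
    """
    n = len(vectors[0])
    out = [[0.0] * n for _ in vectors]
    for i in range(n):
        left, right = (i - 1) % n, (i + 1) % n
        for k, v in enumerate(vectors):
            out[k][i] = (2.0 + mass_sq) * v[i] - v[left] - v[right]
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cg_multi_rhs(rhs, tol=1e-10, max_iter=1000):
    """Solve A x_k = b_k for all k, batching the operator application."""
    n_rhs, n = len(rhs), len(rhs[0])
    x = [[0.0] * n for _ in rhs]      # initial guess x_k = 0
    r = [list(b) for b in rhs]        # residuals r_k = b_k - A x_k = b_k
    p = [list(b) for b in rhs]        # search directions
    rr = [dot(v, v) for v in r]
    # Squared stopping thresholds, relative to |b_k| (absolute if b_k = 0).
    target = [tol * tol * (dot(b, b) or 1.0) for b in rhs]
    for _ in range(max_iter):
        active = [k for k in range(n_rhs) if rr[k] > target[k]]
        if not active:
            break
        ap = apply_operator(p)        # one batched operator call per iteration
        for k in active:              # independent CG update per system
            alpha = rr[k] / dot(p[k], ap[k])
            for i in range(n):
                x[k][i] += alpha * p[k][i]
                r[k][i] -= alpha * ap[k][i]
            rr_new = dot(r[k], r[k])
            beta = rr_new / rr[k]
            for i in range(n):
                p[k][i] = r[k][i] + beta * p[k][i]
            rr[k] = rr_new
    return x


if __name__ == "__main__":
    # Three point sources on an 8-site lattice, solved together.
    sources = [[1.0 if i == k else 0.0 for i in range(8)] for k in range(3)]
    solutions = cg_multi_rhs(sources)
    check = apply_operator(solutions)
    worst = max(abs(check[k][i] - sources[k][i])
                for k in range(3) for i in range(8))
    print("largest residual component:", worst)
```

On an accelerator the inner loop over vectors would map onto threads or vector lanes; the sketch only shows the data reuse pattern, not any performance characteristics.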

Primary authors

Dr Christian Schmidt (Fakultät für Physik, Universität Bielefeld)
Dr Mathias Wagner (Department of Physics, Indiana University)
Swagato Mukherjee (Brookhaven National Laboratory)
Dr Olaf Kaczmarek (Fakultät für Physik, Universität Bielefeld)
Mr Patrick Steinbrecher (Fakultät für Physik, Universität Bielefeld)