In hybrid systems, the parallel processing capability of GPUs can be harnessed for general-purpose computation. This accelerator model of computation is helping HPC facilities follow the path to the exascale, where it is recognized that extremely large numbers of simple cores will be needed to deliver the required performance within realistic power budgets. Significant effort is required, however, for real applications to exploit current GPU architectures effectively. We present the GPU enabling of HPC application codes in the areas of fluid dynamics, nuclear fusion and particle physics. Key computational kernels were identified and ported to the NVIDIA GPU architecture (including Fermi) using CUDA. The raw ports achieved only modest acceleration, but more dramatic performance gains were obtained through optimizations including rearrangement of data structures, use of on-chip memory and registers, tuning of the decomposition, reduction of unnecessary CPU-GPU data transfer and the use of optimized libraries.
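
As a rough illustration of two of the optimizations listed above (rearrangement of data structures and reduction of unnecessary CPU-GPU data transfer), the following CUDA sketch contrasts an array-of-structures layout with a structure-of-arrays layout and keeps the data resident on the GPU across repeated kernel launches. It is not taken from the application codes discussed here: the `ParticleAoS` record, the kernel names `scale_aos` and `scale_soa`, the problem size and the launch configuration are all assumptions chosen for the example.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical record type, for illustration only.
struct ParticleAoS { float x, y, z, w; };

// Array-of-structures access: thread i touches p[i].x, so neighbouring
// threads read addresses 16 bytes apart and part of each memory
// transaction is wasted on fields the kernel does not use.
__global__ void scale_aos(ParticleAoS *p, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) p[i].x *= s;
}

// Structure-of-arrays access: neighbouring threads read consecutive
// floats, which allows the hardware to coalesce the loads and stores.
__global__ void scale_soa(float *x, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}

int main() {
    const int n = 1 << 20;          // assumed problem size
    const int threads = 256;        // assumed block size
    const int blocks = (n + threads - 1) / threads;
    const int steps = 10;

    float *h_x = (float *)malloc(n * sizeof(float));
    for (int i = 0; i < n; ++i) h_x[i] = 1.0f;

    float *d_x = NULL;
    if (cudaMalloc((void **)&d_x, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    // Copy to the device once, run every step there, copy back once,
    // instead of transferring the array around each kernel launch.
    cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);
    for (int step = 0; step < steps; ++step)
        scale_soa<<<blocks, threads>>>(d_x, 2.0f, n);
    cudaMemcpy(h_x, d_x, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Each element was doubled `steps` times starting from 1.0f.
    float expected = 1.0f;
    for (int step = 0; step < steps; ++step) expected *= 2.0f;
    printf("%s\n", (h_x[0] == expected && h_x[n - 1] == expected)
                       ? "result matches" : "result mismatch");

    cudaFree(d_x);
    free(h_x);
    return 0;
}
```

The `scale_aos` kernel is included only for contrast and is not launched. How much either change helps in practice depends on the kernel and the hardware generation, so any gain should be measured rather than assumed.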

Chair/Author Details:

Alan Gray - Edinburgh Parallel Computing Centre

Alan Richardson - Massachusetts Institute of Technology

Karthee Sivalingam - Edinburgh Parallel Computing Centre

Alistair Hart - Cray Inc.

Iain Bethune - Edinburgh Parallel Computing Centre

