Fazenda, Alvaro Luiz; Mendes, Celso L.; Kale, Laxmikant V.; Panetta, Jairo; Rodrigues, Eduardo Rocha
Dynamic Load Balancing in GPU-Based Systems for a MPI program
2014 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS), 154-161, 2014

The dynamic load-balancing framework Charm++/AMPI, developed at the University of Illinois, is based on processor virtualization to allow thread migration across processors. This framework has been successfully applied to many scientific applications in the past, such as BRAMS, NAMD, ChaNGa, and others. Most of these applications use only CPUs, that is, they do not use accelerators. However, the use of GPUs to improve computational performance is spreading rapidly in the high-performance computing community. This paper investigates how the same Charm++/AMPI framework can be extended to balance load in a synthetic application inspired by the BRAMS numerical forecast model, running on GPUs instead of CPUs. Several major questions involving the use of GPUs with AMPI were handled in this work, including: how to measure the GPU's load, how to use and share GPUs among user-level threads, and what results are obtained when applying the required over-decomposition technique to a GPU-accelerated program.
