Result: Bridging the gap between OpenMP and task-based runtime systems for the fast multipole method

Title:
Bridging the gap between OpenMP and task-based runtime systems for the fast multipole method
Contributors:
High-End Parallel Algorithms for Challenging Numerical Simulations (HiePACS); Laboratoire Bordelais de Recherche en Informatique (LaBRI): Université de Bordeaux (UB), Centre National de la Recherche Scientifique (CNRS), École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB); Inria Bordeaux - Sud-Ouest, Institut National de Recherche en Informatique et en Automatique (Inria); STatic Optimizations, Runtime Methods (STORM); Max Planck Computing and Data Facility Garching (MPCDF); Inria ADT K'STAR; Plafrim
Source:
ISSN: 1045-9219; IEEE Transactions on Parallel and Distributed Systems, Institute of Electrical and Electronics Engineers, 2017, 14 p.; https://hal.inria.fr/hal-01517153 ⟨10.1109/TPDS.2017.2697857⟩.
Publisher Information:
HAL CCSD
Institute of Electrical and Electronics Engineers
Publication Year:
2017
Collection:
Archive ouverte HAL (Hyper Article en Ligne, CCSD - Centre pour la Communication Scientifique Directe)
Document Type:
Article in journal/newspaper
Language:
English
DOI:
10.1109/TPDS.2017.2697857
Rights:
info:eu-repo/semantics/OpenAccess
Accession Number:
edsbas.C13D4521
Database:
BASE

Further Information

Abstract (international audience):
With the advent of complex modern architectures, the low-level paradigms long considered sufficient to build High Performance Computing (HPC) numerical codes have met their limits. The need to achieve efficiency and ensure portability while preserving programming tractability on such hardware prompted the HPC community to design new, higher-level paradigms that rely on runtime systems to maintain performance. However, the common weakness of these projects is that they deeply tie applications to specific, expert-only runtime system APIs. The OpenMP specification, which aims to provide common parallel programming means for shared-memory platforms, appears to be a good candidate to address this issue thanks to the task-based constructs introduced in its revision 4.0. The goal of this paper is to assess the effectiveness and limits of this support for designing a high-performance numerical library, ScalFMM, which implements the fast multipole method (FMM) and which we have deeply redesigned around the most advanced features of OpenMP 4. We show that OpenMP 4 allows for significant performance improvements over previous OpenMP revisions on recent multicore processors, and that extensions to the 4.0 standard further improve performance, bridging the gap with the very high performance that was so far reserved to expert-only runtime system APIs.