*Result*: Performance improvements of parallel applications thanks to MPI-4.0 Hints

Title:
Performance improvements of parallel applications thanks to MPI-4.0 Hints
Contributors:
Laboratoire d'Informatique en Calcul Intensif et Image pour la Simulation (LICIIS UR 3690 LRC DIGIT), Université de Reims Champagne-Ardenne (URCA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA), Laboratoire de Recherche Conventionné DIGIT (LRC CEA/URCA DIGIT), Université de Reims Champagne-Ardenne (URCA)-CEA/DAM Arpajon (CEA/DAM), Direction des Applications Militaires (DAM), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Direction des Applications Militaires (DAM), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA), Laboratoire en Informatique Haute Performance pour le Calcul et la simulation (LIHPC), DAM Île-de-France (DAM/DIF), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay, IEEE
Source:
IEEE SBAC-PAD 2022 - 34th International Symposium on Computer Architecture and High Performance Computing, Nov 2022, Bordeaux, France. pp.273-282, ⟨10.1109/SBAC-PAD55451.2022.00038⟩ ; https://hal.science/hal-03793122 ; https://project.inria.fr/sbac2022/
Publisher Information:
CCSD
IEEE
Publication Year:
2022
Collection:
HAL-CEA (Commissariat à l'énergie atomique et aux énergies alternatives)
Document Type:
*Conference* conference object
Language:
English
DOI:
10.1109/SBAC-PAD55451.2022.00038
Rights:
info:eu-repo/semantics/OpenAccess
Accession Number:
edsbas.37D6323E
Database:
BASE

*Further Information*

*HPC systems have grown significantly in recent years, and modern machines now have hundreds of thousands of nodes. The Message Passing Interface (MPI) is the de facto standard for distributed computing on these architectures. On the MPI critical path, message matching is one of the most time-consuming operations: searching a message queue for a specific request accounts for a significant part of the communication latency. So far, no single algorithm performs well in all cases. This paper explores potential matching specializations enabled by the hints introduced in the MPI 4.0 standard. We propose a hash-table-based algorithm that performs constant-time message matching for requests that use no wildcards. This approach is suitable for the intensive point-to-point communication phases found in many applications (more than 50% of the CORAL benchmarks). We demonstrate that our approach can improve the overall execution time of real HPC applications by up to 25%. We also analyze the limitations of our method and propose a strategy for identifying the most suitable algorithm for a given application: we apply machine learning techniques to classify applications according to their message pattern characteristics.*
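
The constant-time matching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, which the record does not describe in detail. It assumes that the application has promised, for example through the MPI 4.0 info keys `mpi_assert_no_any_source` and `mpi_assert_no_any_tag`, that no receive uses a wildcard, so every receive and every incoming message carries a complete (communicator, source, tag) triple that can serve as a hash key. The class `NoWildcardMatcher` and its method names are hypothetical.

```python
from collections import defaultdict, deque


class NoWildcardMatcher:
    """Toy matching engine keyed on (communicator id, source rank, tag).

    Hypothetical illustration only: it assumes no receive uses a wildcard,
    so a single hash lookup replaces a linear queue search.
    """

    def __init__(self):
        # Receives posted before their message arrived.
        self.posted = defaultdict(deque)
        # Messages that arrived before a matching receive was posted.
        self.unexpected = defaultdict(deque)

    def post_recv(self, comm, source, tag, request):
        """Post a receive; return (request, payload) if a message was waiting."""
        key = (comm, source, tag)
        queue = self.unexpected.get(key)
        if queue:
            payload = queue.popleft()  # oldest message first (FIFO per key)
            if not queue:
                del self.unexpected[key]
            return (request, payload)
        self.posted[key].append(request)
        return None

    def on_arrival(self, comm, source, tag, payload):
        """Handle an incoming message; return (request, payload) on a match."""
        key = (comm, source, tag)
        queue = self.posted.get(key)
        if queue:
            request = queue.popleft()  # oldest posted receive first
            if not queue:
                del self.posted[key]
            return (request, payload)
        self.unexpected[key].append(payload)
        return None
```

Each key holds a FIFO queue, so messages with the same communicator, source, and tag are matched in the order they were posted, while the lookup itself costs one hash operation on average. A wildcard receive would break this scheme because its key would be incomplete, which is why the sketch depends on the no-wildcard assumption.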