*Result*: Parallel distributed productivity-aware tree-search using Chapel

Title:
Parallel distributed productivity-aware tree-search using Chapel
Source:
Concurrency and Computation: Practice and Experience, 35 (27) (2023-07-19)
Publisher Information:
John Wiley and Sons Ltd
Publication Year:
2023
Collection:
University of Luxembourg: ORBilu - Open Repository and Bibliography
Document Type:
*Academic Journal* article in journal/newspaper
Language:
English
ISSN:
1532-0626
1532-0634
Relation:
urn:issn:1532-0626; urn:issn:1532-0634; https://orbilu.uni.lu/handle/10993/58922; info:hdl:10993/58922; https://orbilu.uni.lu/bitstream/10993/58922/1/Helbecque_et_al_CCPE.pdf; wos:001031509900001
DOI:
10.1002/cpe.7874
Rights:
open access ; http://purl.org/coar/access_right/c_abf2 ; info:eu-repo/semantics/openAccess
Accession Number:
edsbas.8FFEF9EB
Database:
BASE

*Further Information*

*peer reviewed ; With the recent arrival of the exascale era, modern supercomputers are increasingly big making their programming much more complex. In addition to performance, software productivity is a major concern to choose a programming language, such as Chapel, designed for exascale computing. In this paper, we investigate the design of a parallel distributed tree-search algorithm, namely P3D-DFS, and its implementation using Chapel. The design is based on the Chapel's DistBag data structure, revisited by: (1) redefining the data structure for Depth-First tree-Search (DFS), henceforth renamed DistBag-DFS; (2) redesigning the underlying load balancing mechanism. In addition, we propose two instantiations of P3D-DFS considering the Branch-and-Bound (B&B) and Unbalanced Tree Search (UTS) algorithms. In order to evaluate how much performance is traded for productivity, we compare the Chapel-based implementations of B&B and UTS to their best-known counterparts based on traditional OpenMP (intra-node) and MPI+X (inter-node). For experimental validation using 4096 processing cores, we consider the permutation flow-shop scheduling problem for B&B and synthetic literature benchmarks for UTS. The reported results show that P3D-DFS competes with its OpenMP baselines for coarser-grained shared-memory scenarios, and with its MPI+X counterparts for distributed-memory settings, considering both performance and productivity-awareness. In the context of this work, this makes Chapel an alternative to OpenMP/MPI+X for exascale programming. ; Calcul ultra-scale pour la résolution de problèmes d'optimisation de grande taille*