Title:
Distributed and Parallel Computing for very Large Neural Networks ; Calcul réparti et parallèle pour les réseaux de neurones de très grandes tailles
Authors:
Contributors:
Huawei Technologies France, Huawei Technologies France Boulogne-Billancourt, Maison de la Simulation (MDLS), Université de Versailles Saint-Quentin-en-Yvelines (UVSQ)-Institut National de Recherche en Informatique et en Automatique (Inria)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS), Université Paris-Saclay, Nahid Emad Petiton, Chong Li
Source:
https://hal.science/tel-05085486 ; Distributed, Parallel, and Cluster Computing [cs.DC]. Université Paris-Saclay, 2025. English. ⟨NNT : 2025UPASG002⟩.
Publisher Information:
CCSD
Publication Year:
2025
Document Type:
*Dissertation/Thesis* (doctoral or postdoctoral thesis)
Language:
English
Relation:
NNT: 2025UPASG002
Rights:
https://about.hal.science/hal-authorisation-v1/ ; info:eu-repo/semantics/OpenAccess
Accession Number:
edsbas.8D0EE424
Database:
BASE

*Further Information*

*Very large models are now commonplace, extending the range of applications for Deep Learning. However, this exponential growth in model size has led to an equally significant increase in computing power requirements. Innovative solutions must be found and implemented to optimize current algorithms, reduce their complexity, and make them easy to use and deploy in a massively distributed environment. Developing parallel and distributed computing techniques and methods that fully exploit the available resources is crucial to maximizing efficiency and minimizing computation costs as the requirements of these models keep growing. In this context, we propose several contributions to reduce the costs associated with training neural networks in a massively distributed environment. Our contributions focus on processing data upstream of the model, in order to improve the quality of the data supplied to the neural network and to facilitate its training. We focused on the processing of sparse data, such as graphs, which pose particular challenges due to their complex structure and potentially very large size. The processing applied to these data is designed to significantly improve the model's performance. Finally, we propose leveraging this processing to effectively reduce the size of the data, thereby decreasing the number of inputs while retaining sufficient information to ensure good model accuracy. ; [French abstract, translated:] Very large models are now used in a wide variety of domains and have generalized and popularized the use of Deep Learning for new applications. However, handling these ever more general tasks has required an exponential increase in the size of these models, which in turn has demanded equally significant computing power to train them. Innovative solutions must be found and deployed both to reduce the complexity of existing algorithms and to improve the ...*
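The record does not detail the thesis's actual algorithms, but the abstract describes preprocessing that shrinks a sparse graph input while retaining enough structure for accurate training. As a rough illustration of that idea only, here is a minimal sketch of graph coarsening by heavy-edge matching with scipy.sparse; the matching heuristic, function names, and mean-pooling choice are all assumptions for illustration, not the method proposed in the thesis.

```python
# Illustrative sketch (not the thesis's method): coarsen a sparse graph by
# greedily matching each node with its heaviest unmatched neighbour, then
# pool node features so the model sees fewer, denser inputs.
import numpy as np
import scipy.sparse as sp

def heavy_edge_matching(adj: sp.csr_matrix) -> np.ndarray:
    """Return a cluster id per node; matched pairs share an id."""
    n = adj.shape[0]
    cluster = -np.ones(n, dtype=np.int64)
    next_id = 0
    # Visit nodes with the largest total edge weight first.
    for u in np.argsort(-adj.sum(axis=1).A1):
        if cluster[u] != -1:
            continue
        start, end = adj.indptr[u], adj.indptr[u + 1]
        nbrs, wts = adj.indices[start:end], adj.data[start:end]
        free = cluster[nbrs] == -1
        if free.any():
            v = nbrs[free][np.argmax(wts[free])]  # heaviest free neighbour
            cluster[u] = cluster[v] = next_id
        else:
            cluster[u] = next_id  # no free neighbour: singleton cluster
        next_id += 1
    return cluster

def coarsen(adj: sp.csr_matrix, feats: np.ndarray):
    """Build the coarse adjacency P^T A P and mean-pooled features."""
    cluster = heavy_edge_matching(adj)
    n, k = adj.shape[0], cluster.max() + 1
    # P is the n-by-k assignment matrix mapping fine nodes to clusters.
    P = sp.csr_matrix((np.ones(n), (np.arange(n), cluster)), shape=(n, k))
    coarse_adj = (P.T @ adj @ P).tocsr()
    sizes = np.asarray(P.sum(axis=0)).ravel()
    coarse_feats = (P.T @ feats) / sizes[:, None]  # average per cluster
    return coarse_adj, coarse_feats
```

Each pass of this kind of coarsening roughly halves the number of nodes, so a few passes can cut the input size substantially before distributed training, at the cost of some lost fine-grained structure; the thesis evaluates that trade-off in terms of model accuracy.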