A snapshot of parallelism in distributed deep learning training

Hairol Romero-Sandí; Gabriel Núñez; Elvis Rojas

doi:10.29375/25392115.5054

Revista Colombiana de Computación (Jun 2024)

Affiliations

Hairol Romero-Sandí: Universidad Nacional
Gabriel Núñez: Universidad Nacional
Elvis Rojas: Universidad Nacional | National High Technology Center

DOI: https://doi.org/10.29375/25392115.5054
Journal volume & issue: Vol. 25, no. 1

Abstract

The accelerated development of applications related to artificial intelligence has led to increasingly complex neural network models with enormous numbers of parameters, currently reaching into the trillions. Training such models is almost impossible without parallelization. Parallelism, applied through different approaches, is the mechanism that has been used to solve the problem of training at large scale. This paper presents a glimpse of the state of the art in parallelism for deep learning training from multiple points of view. The study addresses pipeline parallelism, hybrid parallelism, mixture-of-experts, and auto-parallelism, topics that currently play a leading role in scientific research in this area. Finally, we carry out a series of experiments with data parallelism and model parallelism, so that the reader can observe the performance of these two types of parallelism and understand the approach of each more clearly.

Published in Revista Colombiana de Computación

ISSN: 1657-2831 (Print); 2539-2115 (Online)
Publisher: Universidad Autónoma de Bucaramanga
Country of publisher: Colombia
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://revistas.unab.edu.co/index.php/rcc/index
