大数据 (Big Data Research), Sep 2024

Application of distributed techniques in large language model training and inference

  • ZHENG Weimin

Abstract

In recent years, artificial intelligence has been widely applied across many fields, and the "pre-training and fine-tuning" paradigm of large language models (LLMs) has become the latest paradigm of artificial intelligence. Distributed techniques underpin every stage of the LLM lifecycle. In the data acquisition stage, a file system called "SuperFS" was developed to address the storage of massive numbers of small files, meeting the requirements of low latency and scalability. In the data preprocessing stage, an efficient big data processing engine called "Chukonu" was developed to address the high overhead of reading data from distributed file systems. In the model training stage, a distributed checkpoint strategy was proposed to address the poor read and write performance of checkpoint files, greatly improving their read and write speed. In the model inference stage, a high-throughput inference scheme called "FastDecode" and an LLM inference architecture called "Mooncake" were developed to address the challenge that the KVCache poses to storage systems. These applications of distributed technology enable LLMs to fully utilize computing resources, accelerate training, and benefit the development of artificial intelligence.
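The abstract does not detail the checkpoint strategy itself, but the general idea behind distributed checkpointing is that each worker writes only its own shard of the training state in parallel, rather than funneling the full state through a single rank and writing one huge file. A minimal PyTorch sketch of this pattern (function and path names are illustrative, not from the paper; assumes the process group has been initialized, e.g. via torchrun):

```python
import os
import torch
import torch.distributed as dist

def save_sharded_checkpoint(model, optimizer, step, ckpt_dir):
    """Each rank writes only its own shard of the training state.

    N ranks writing N smaller files in parallel spreads the I/O
    across the cluster and scales with the number of workers.
    """
    rank = dist.get_rank()
    os.makedirs(ckpt_dir, exist_ok=True)
    shard_path = os.path.join(ckpt_dir, f"step{step}-rank{rank}.pt")
    torch.save(
        {
            "step": step,
            "model": model.state_dict(),          # this rank's parameters
            "optimizer": optimizer.state_dict(),  # and optimizer state
        },
        shard_path,
    )
    dist.barrier()  # all shards on disk before the checkpoint counts as done

def load_sharded_checkpoint(model, optimizer, step, ckpt_dir):
    rank = dist.get_rank()
    shard_path = os.path.join(ckpt_dir, f"step{step}-rank{rank}.pt")
    state = torch.load(shard_path, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["step"]
```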
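To see why the KVCache strains the storage system during inference: every transformer layer caches a key and a value tensor per token, so the cache grows linearly with both batch size and sequence length. A back-of-the-envelope estimate (figures are illustrative, roughly matching a 7B-parameter model; not taken from the paper):

```python
def kv_cache_bytes(layers, hidden, seq_len, batch, bytes_per_value=2):
    # 2 tensors (K and V) per layer, each of shape [batch, seq_len, hidden],
    # stored in fp16 (2 bytes per value).
    return 2 * layers * batch * seq_len * hidden * bytes_per_value

# Roughly a 7B-class model: 32 layers, hidden size 4096, fp16.
size = kv_cache_bytes(layers=32, hidden=4096, seq_len=4096, batch=8)
print(f"{size / 2**30:.1f} GiB")  # 16.0 GiB
```

Just 8 concurrent 4K-token requests already consume about 16 GiB, comparable to the fp16 weights of the model itself, which is why schemes like FastDecode and Mooncake treat KVCache placement as a first-class storage problem.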

Keywords