IEEE Access (Jan 2024)

Repeatability of Fine-Tuning Large Language Models Illustrated Using QLoRA

  • Saeed S. Alahmari,
  • Lawrence O. Hall,
  • Peter R. Mouton,
  • Dmitry B. Goldgof

DOI: https://doi.org/10.1109/ACCESS.2024.3470850
Journal volume & issue: Vol. 12, pp. 153221–153231

Abstract

Large language models (LLMs) have shown progress and promise in diverse applications, ranging from the medical field to chatbots. Developing LLMs requires a large corpus of data and significant computational resources to achieve efficient learning. Foundation models (in particular, LLMs) serve as the basis for fine-tuning on a new corpus of data. Since the original foundation models contain a very large number of parameters, fine-tuning them can be quite challenging. The development of the low-rank adaptation (LoRA) technique for fine-tuning, and its quantized version, known as QLoRA, allows LLMs to be fine-tuned on a new, smaller corpus of data. This paper focuses on the repeatability of fine-tuning four LLMs using QLoRA. We fine-tuned each of them for seven trials under the same hardware and software settings. We also validated our study of the repeatability (stability) issue by fine-tuning the LLMs on two public datasets. For each trial, each LLM was fine-tuned on a subset of the dataset and tested on a holdout test set. Fine-tuning and inference were done on a single GPU. Our study shows that fine-tuning LLMs with the QLoRA method is not repeatable (not stable): different fine-tuning runs result in different performance on the holdout test set.
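To make the experimental protocol concrete, the following is a minimal Python sketch of repeated QLoRA fine-tuning trials using the Hugging Face transformers and peft libraries. It is not the authors' code: the model name, LoRA hyperparameters, and training settings are illustrative assumptions, and the datasets are assumed to be already tokenized. The point is the loop structure: the same seed and configuration for every trial, with per-trial holdout evaluation, so that any spread in scores reflects run-to-run nondeterminism.

    # Sketch: seven QLoRA fine-tuning trials under identical settings.
    # Assumptions: MODEL_NAME, LoRA ranks/targets, and training arguments
    # are placeholders; train_ds / eval_ds are pre-tokenized datasets.
    import torch
    from transformers import (AutoModelForCausalLM, BitsAndBytesConfig,
                              Trainer, TrainingArguments, set_seed)
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    MODEL_NAME = "meta-llama/Llama-2-7b-hf"  # assumed; the paper uses four LLMs

    # 4-bit NF4 quantization with double quantization: the core of QLoRA.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    def run_trial(trial_idx, train_ds, eval_ds):
        # Same seed every trial: any remaining variability is nondeterminism,
        # not a deliberate change in initialization or data order.
        set_seed(42)
        model = AutoModelForCausalLM.from_pretrained(
            MODEL_NAME, quantization_config=bnb_config,
            device_map={"": 0})  # single GPU, as in the paper
        model = prepare_model_for_kbit_training(model)
        lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                          target_modules=["q_proj", "v_proj"],
                          task_type="CAUSAL_LM")
        model = get_peft_model(model, lora)  # train only low-rank adapters
        args = TrainingArguments(output_dir=f"qlora_trial_{trial_idx}",
                                 num_train_epochs=1,
                                 per_device_train_batch_size=4,
                                 learning_rate=2e-4,
                                 logging_steps=50)
        trainer = Trainer(model=model, args=args,
                          train_dataset=train_ds, eval_dataset=eval_ds)
        trainer.train()
        return trainer.evaluate()  # holdout metrics for this trial

    # Seven trials under the same hardware/software settings; comparing the
    # per-trial holdout scores exposes the (in)stability the paper reports.
    # scores = [run_trial(i, train_ds, eval_ds)["eval_loss"] for i in range(7)]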

Keywords