Large language models and their applications in bioinformatics

Oluwafemi A. Sarumi; Dominik Heider

Computational and Structural Biotechnology Journal (Dec 2024)

Affiliations

Oluwafemi A. Sarumi: University of Münster, Institute of Medical Informatics, Albert-Schweitzer-Campus, Münster, 48149, Germany; Institute of Computer Science, Heinrich-Heine-University Duesseldorf, Graf-Adolf-Str. 63, Duesseldorf, 40215, Germany
Dominik Heider: University of Münster, Institute of Medical Informatics, Albert-Schweitzer-Campus, Münster, 48149, Germany; Institute of Computer Science, Heinrich-Heine-University Duesseldorf, Graf-Adolf-Str. 63, Duesseldorf, 40215, Germany; Corresponding author.

Journal volume & issue: Vol. 23, pp. 3498 – 3505

Abstract

Recent advances in Natural Language Processing (NLP) have been driven largely by the development of Large Language Models (LLMs), which represent a substantial leap in the capabilities of language-based technology. These models are built on deep learning architectures, typically transformers, and are characterized by billions of parameters and extensive training data, enabling them to achieve high accuracy across a variety of tasks. The transformer architecture allows LLMs to handle context and sequential information effectively, which is crucial for understanding and generating human language. Beyond traditional NLP applications, LLMs have shown significant promise in bioinformatics, transforming the field by addressing challenges associated with large and complex biological datasets. In genomics, proteomics, and personalized medicine, LLMs help identify patterns, predict protein structures, and interpret genetic variations. These capabilities are crucial, for example, for advancing drug discovery, where accurate prediction of molecular interactions is essential. This review discusses current trends in LLM research and the potential of LLMs to revolutionize bioinformatics and accelerate novel discoveries in the life sciences.

Published in Computational and Structural Biotechnology Journal

ISSN: 2001-0370 (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Technology: Chemical technology: Biotechnology
Website: https://www.journals.elsevier.com/computational-and-structural-biotechnology-journal
