Application of Generative Large Language Models in Chinese Radiology Domain

CHEN Longfei, GAO Xin, HOU Haotian, YE Chuyang, LIU Ya'ou, ZHANG Meihui

doi:10.3778/j.issn.1673-9418.2406041

Jisuanji kexue yu tansuo (Sep 2024)

Affiliations

CHEN Longfei, GAO Xin, HOU Haotian, YE Chuyang, LIU Ya'ou, ZHANG Meihui:
1. School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China
2. School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China
3. Department of Radiology, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China

DOI: https://doi.org/10.3778/j.issn.1673-9418.2406041
Journal volume & issue: Vol. 18, no. 9
pp. 2337 – 2348

Abstract

In the Chinese radiology domain, radiology reports are a crucial basis for clinical decision-making. Using natural language processing (NLP) to understand and learn from the text of these reports, and thereby support clinical radiology work, has therefore become an important research direction. However, traditional methods for natural language classification and generation tasks on Chinese radiology reports still face a lack of training corpora, privacy concerns, and poor model generalization, which together limit overall performance. To address these issues, a solution for natural language tasks in the Chinese radiology domain is proposed, based on local, efficient fine-tuning of large language models. A large-scale, high-quality dataset for natural language tasks on Chinese radiology reports is collected and constructed, and the open-source large language model Baichuan2 is trained by supervised fine-tuning with the LoRA efficient fine-tuning method. The resulting model, “RadGPT”, can handle four types of clinical tasks in the Chinese radiology domain simultaneously. An evaluation system for natural language classification and generation tasks in the Chinese radiology domain is also introduced. Multiple sets of experiments are conducted on three types of radiology report datasets from two centers, with comparisons against several typical existing methods. The results show that the proposed method performs better in classification, in text summarization and expansion, and in model generalization.
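For readers unfamiliar with LoRA, the low-rank update it relies on can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the function names, the toy matrix sizes, and the values of the rank `r` and scaling factor `alpha` are all assumptions made for illustration, since the abstract does not state the hyperparameters used for RadGPT. The idea shown is that the pretrained weight `W0` stays frozen while only two small matrices `A` and `B` are trained, giving an effective weight `W0 + (alpha / r) * B @ A`.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def init_lora(d_out, d_in, r, seed=0):
    """Create LoRA factors: A is small random (r x d_in), B is zero (d_out x r).

    Starting B at zero means the adapter initially leaves the frozen
    weight unchanged. The standard deviation 0.01 is an illustrative choice.
    """
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
    b = [[0.0] * r for _ in range(d_out)]
    return a, b


def lora_effective_weight(w0, a, b, alpha):
    """Return W0 + (alpha / r) * B @ A, where r is the adapter rank."""
    r = len(a)  # A has shape (r, d_in)
    scale = alpha / r
    delta = matmul(b, a)  # shape (d_out, d_in)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(w0, delta)
    ]


def trainable_fraction(d_out, d_in, r):
    """Ratio of LoRA parameters to the parameters of the full weight matrix."""
    return r * (d_in + d_out) / (d_out * d_in)
```

For a hypothetical 4096 x 4096 projection with rank 8, `trainable_fraction(4096, 4096, 8)` is about 0.4% of the full matrix, which is why this family of methods is described as efficient fine-tuning and can be run on local hardware.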

Keywords: large language model; radiology report; text classification; text generation; efficient fine-tuning strategy

Published in Jisuanji kexue yu tansuo

ISSN: 1673-9418 (Print)
Publisher: Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press
Country of publisher: China
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://fcst.ceaj.org
