Applied Sciences (Mar 2024)

Contrastive Learning Penalized Cross-Entropy with Diversity Contrastive Search Decoding for Diagnostic Report Generation of Reduced Token Repetition

  • Taozheng Zhang,
  • Jiajian Meng,
  • Yuseng Yang,
  • Shaode Yu

DOI: https://doi.org/10.3390/app14072817
Journal volume & issue: Vol. 14, no. 7, p. 2817

Abstract

Medical imaging description and disease diagnosis are vitally important yet time-consuming. Automated diagnostic report generation (DRG) from medical imaging descriptions can reduce clinicians’ workload and improve their routine efficiency. To address this natural language generation task, fine-tuning a pre-trained large language model (LLM) is cost-effective, and its success has been demonstrated in many downstream applications. However, semantic inconsistency of sentence embeddings is widely observed, manifesting as undesirable repetition or unnaturalness in generated text. To address the underlying issue of the anisotropic distribution of token representations, in this study a contrastive learning penalized cross-entropy (CLpCE) objective function is implemented to enhance the semantic consistency and accuracy of token representations by guiding the fine-tuning procedure towards the specific task. Furthermore, to improve the diversity of token generation in text summarization and to prevent sampling from the unreliable tail of token distributions, a diversity contrastive search (DCS) decoding method is designed that restricts report generation to a probable candidate set while maintaining semantic coherence. In addition, a novel metric named the maximum token repetition ratio (maxTRR) is proposed to estimate token diversity and to help select the candidate output. Built on a Chinese version of the generative pre-trained Transformer 2 (GPT-2), the proposed CLpCE with DCS (CLpCEwDCS) decoding framework is validated on 30,000 desensitized text samples from the “Medical Imaging Diagnosis Report Generation” track of the 2023 Global Artificial Intelligence Technology Innovation Competition. Using four kinds of metrics based on n-gram word matching, semantic relevance, and content similarity, as well as the maxTRR metric, extensive experiments reveal that the proposed framework effectively maintains semantic coherence and accuracy (BLEU-1, 0.4937; BLEU-2, 0.4107; BLEU-3, 0.3461; BLEU-4, 0.2933; METEOR, 0.2612; ROUGE, 0.5182; CIDEr, 1.4339) and improves text generation diversity and naturalness (maxTRR, 0.12). Dull or repetitive text generation is a common phenomenon when fine-tuning pre-trained LLMs for natural language processing applications. This study may shed some light on relieving this issue through comprehensive strategies that enhance the semantic coherence, accuracy, and diversity of sentence embeddings.
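The abstract does not spell out the CLpCE objective. As a rough illustration only, the sketch below pairs token-level cross-entropy with a SimCTG-style pairwise contrastive penalty on token representations; the margin value, the weighting coefficient `lam`, and the exact form of the contrastive term are assumptions, not the paper’s specification.

```python
import torch
import torch.nn.functional as F

def clpce_loss(logits, hidden_states, labels, margin=0.5, lam=1.0):
    """Sketch of a contrastive-learning-penalized cross-entropy objective.

    logits:        (B, T, V) next-token logits from the LM head
    hidden_states: (B, T, H) last-layer token representations
    labels:        (B, T)    target token ids, -100 for ignored positions
    """
    # Standard token-level cross-entropy for the generation task.
    ce = F.cross_entropy(
        logits.view(-1, logits.size(-1)), labels.view(-1), ignore_index=-100
    )
    # Cosine similarity between every pair of token representations.
    h = F.normalize(hidden_states, dim=-1)   # (B, T, H)
    sim = torch.bmm(h, h.transpose(1, 2))    # (B, T, T)
    B, T = sim.size(0), sim.size(1)
    off_diag = ~torch.eye(T, dtype=torch.bool, device=sim.device)
    # Hinge penalty on distinct-token pairs that are too similar, pushing
    # the representation space away from an anisotropic (collapsed) geometry.
    cl = (F.relu(margin - 1.0 + sim) * off_diag).sum() / (B * T * (T - 1))
    return ce + lam * cl
```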
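Likewise, the abstract leaves the DCS decoding rule unspecified. The step below implements plain contrastive search (Su et al., 2022), the decoding scheme that DCS extends: each next token is chosen from the top-k candidates by trading off model confidence against a degeneration penalty (the candidate’s maximum cosine similarity to the context). The diversity mechanism that distinguishes DCS is not shown, and `model` is assumed to be a Hugging Face-style causal LM that returns hidden states.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def contrastive_search_step(model, input_ids, k=5, alpha=0.6):
    """One decoding step of plain contrastive search; input_ids is (1, T)."""
    out = model(input_ids, output_hidden_states=True)
    probs = F.softmax(out.logits[:, -1, :], dim=-1)       # next-token dist.
    top_p, top_ids = probs.topk(k, dim=-1)                # confident subset
    ctx = F.normalize(out.hidden_states[-1][0], dim=-1)   # (T, H) context reps

    best_score, best_id = None, None
    for p, tok in zip(top_p[0], top_ids[0]):
        # Re-run with the candidate appended to get its representation.
        cand = torch.cat([input_ids, tok.view(1, 1)], dim=-1)
        h = model(cand, output_hidden_states=True).hidden_states[-1][0, -1]
        h = F.normalize(h, dim=-1)
        degeneration = (ctx @ h).max()                    # max cosine similarity
        # Balance model confidence against the degeneration penalty.
        score = (1 - alpha) * p - alpha * degeneration
        if best_score is None or score > best_score:
            best_score, best_id = score, tok
    return torch.cat([input_ids, best_id.view(1, 1)], dim=-1)
```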
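Finally, maxTRR is rendered here only from its name: the relative frequency of the most repeated token in a generated report, used to pick the most token-diverse candidate. The paper’s actual definition (token versus n-gram level, stop-word handling, the exact selection rule) may differ.

```python
from collections import Counter

def max_trr(token_ids):
    """Maximum token repetition ratio: count of the most frequent token
    divided by sequence length (assumed definition)."""
    if not token_ids:
        return 0.0
    return max(Counter(token_ids).values()) / len(token_ids)

def pick_report(candidates):
    """Return the candidate report (list of token ids) with the lowest
    maxTRR, i.e., the most token-diverse output; an assumed selection rule
    echoing the abstract's use of maxTRR to help determine the output."""
    return min(candidates, key=max_trr) if candidates else None
```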

Keywords