Mathematics (Jul 2024)

Unified Training for Cross-Lingual Abstractive Summarization by Aligning Parallel Machine Translation Pairs

  • Shaohuan Cheng,
  • Wenyu Chen,
  • Yujia Tang,
  • Mingsheng Fu,
  • Hong Qu

DOI: https://doi.org/10.3390/math12132107
Journal volume & issue: Vol. 12, no. 13, p. 2107

Abstract


Cross-lingual summarization (CLS) is essential for enhancing global communication by facilitating efficient information exchange across different languages. However, owing to the scarcity of CLS data, recent studies have employed multi-task frameworks to combine parallel monolingual summaries. These methods often use independent decoders or models with non-shared parameters because of the mismatch in output languages, which limits the transfer of knowledge between CLS and its parallel data. To address this issue, we propose a unified training method for CLS that combines parallel machine translation (MT) pairs with CLS pairs, jointly training them within a single model. This design ensures consistent input and output languages and promotes knowledge sharing between the two tasks. To further enhance the model’s capability to focus on key information, we introduce two additional loss terms to align the hidden representations and probability distributions between the parallel MT and CLS pairs. Experimental results demonstrate that our method outperforms competitive methods in both full-dataset and low-resource scenarios on two benchmark datasets, Zh2EnSum and En2ZhSum.
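A minimal sketch of the unified training objective described in the abstract, assuming a Hugging Face-style sequence-to-sequence model (e.g., mBART) whose outputs expose `loss`, `logits`, and `decoder_hidden_states`. The CLS pair and its parallel MT pair are trained in one model, and two extra terms align hidden representations and output distributions. The specific loss forms (MSE, KL divergence) and weights `alpha`, `beta` are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def unified_loss(model, cls_batch, mt_batch, alpha=0.1, beta=0.1):
    """Hypothetical joint loss for a CLS pair and its parallel MT pair."""
    # CLS pair: source-language article -> target-language summary.
    cls_out = model(input_ids=cls_batch["input_ids"],
                    labels=cls_batch["labels"],
                    output_hidden_states=True)
    # Parallel MT pair: source-language summary -> target-language summary,
    # trained in the same model so input and output languages stay consistent.
    mt_out = model(input_ids=mt_batch["input_ids"],
                   labels=mt_batch["labels"],
                   output_hidden_states=True)

    # Align decoder hidden representations of the two parallel pairs
    # (both decode the same target-language reference summary).
    h_cls = cls_out.decoder_hidden_states[-1]
    h_mt = mt_out.decoder_hidden_states[-1]
    repr_loss = F.mse_loss(h_cls, h_mt.detach())

    # Align output probability distributions over the shared vocabulary.
    log_p_cls = F.log_softmax(cls_out.logits, dim=-1)
    p_mt = F.softmax(mt_out.logits, dim=-1).detach()
    dist_loss = F.kl_div(log_p_cls, p_mt, reduction="batchmean")

    # Both generation losses plus the two alignment terms.
    return cls_out.loss + mt_out.loss + alpha * repr_loss + beta * dist_loss
```

In this sketch the MT branch is detached before the alignment terms, so the translation pair acts as a teacher signal that pulls the CLS branch toward the same key information; the paper's actual alignment strategy may differ.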

Keywords