High-Confidence Computing (Jun 2025)

On protecting the data privacy of Large Language Models (LLMs) and LLM agents: A literature review

  • Biwei Yan
  • Kun Li
  • Minghui Xu
  • Yueyan Dong
  • Yue Zhang
  • Zhaochun Ren
  • Xiuzhen Cheng

DOI
https://doi.org/10.1016/j.hcc.2025.100300
Journal volume & issue
Vol. 5, no. 2
p. 100300

Abstract

Large Language Models (LLMs) are complex artificial intelligence systems that can understand, generate, and translate human language. By analyzing large amounts of textual data, these models learn language patterns to perform tasks such as writing, conversation, and summarization. Agents built on LLMs (LLM agents) further extend these capabilities, allowing them to process user interactions and perform complex operations in diverse task environments. However, while processing and generating massive amounts of data, LLMs and LLM agents risk leaking sensitive information, threatening data privacy. This paper examines the data privacy issues associated with LLMs and LLM agents to facilitate a comprehensive understanding of them. Specifically, we conduct an in-depth survey of privacy threats, encompassing both passive privacy leakage and active privacy attacks. We then introduce the privacy protection mechanisms employed by LLMs and LLM agents and provide a detailed analysis of their effectiveness. Finally, we discuss the privacy protection challenges facing LLMs and LLM agents and outline potential directions for future developments in this domain.

Keywords