High-Confidence Computing (Jun 2025)

On protecting the data privacy of Large Language Models (LLMs) and LLM agents: A literature review

  • Biwei Yan
  • Kun Li
  • Minghui Xu
  • Yueyan Dong
  • Yue Zhang
  • Zhaochun Ren
  • Xiuzhen Cheng

DOI
https://doi.org/10.1016/j.hcc.2025.100300
Journal volume & issue
Vol. 5, no. 2
p. 100300

Abstract

Large Language Models (LLMs) are complex artificial intelligence systems that can understand, generate, and translate human language. By analyzing large amounts of textual data, these models learn language patterns to perform tasks such as writing, conversation, and summarization. Agents built on LLMs (LLM agents) further extend these capabilities, allowing them to process user interactions and perform complex operations in diverse task environments. However, while processing and generating massive amounts of data, LLMs and LLM agents risk leaking sensitive information, threatening data privacy. This paper examines the data privacy issues associated with LLMs and LLM agents to facilitate a comprehensive understanding of them. Specifically, we conduct an in-depth survey of privacy threats, encompassing both passive privacy leakage and active privacy attacks. We then introduce the privacy protection mechanisms employed by LLMs and LLM agents and provide a detailed analysis of their effectiveness. Finally, we discuss the privacy protection challenges facing LLMs and LLM agents and outline potential directions for future developments in this domain.

Keywords