Skip to content

darshjain/LLMPrivacyAttack

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

21 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Data Privacy Project

Overview

Large Language Models (LLMs) are increasingly being deployed as autonomous agents capable of performing complex tasks with minimal human supervision. These LLM agents often have access to sensitive user data, corporate information, and other private content.

Problem Statement

Unlike traditional machine learning systems with well-defined input/output boundaries, LLM agents maintain conversation state and can be induced to reveal information through sophisticated interaction patterns. Privacy concerns around LLMs have primarily focused on training data extraction, but less attention has been paid to privacy vulnerabilities during deployment as agents that interact with sensitive data sources [Li et al., 2024; Wang et al., 2025].

Research Goals

This research project aims to fill this gap by:

  • Investigating novel privacy attack vectors specific to LLM agents.
  • Developing robust defensive mechanisms to mitigate privacy risks during deployment.

Motivation

The increasing adoption of LLM agents across industries handling sensitive information, such as:

  • Healthcare
  • Finance
  • Legal services

makes this research timely and critical.

Context and Impact

Recent incidents have demonstrated that even commercial LLM systems can be manipulated to reveal private information through carefully crafted prompts. As these systems gain broader access privileges and operate with greater autonomy, the privacy implications grow more significant.

About

Privacy analysis of deployed LLM agents, focusing on interaction-based leakage attacks and defensive mechanisms.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages