-
Notifications
You must be signed in to change notification settings - Fork 0
Open
Labels
enhancementNew feature or requestNew feature or request
Description
Web Cloaking — Task Description
Description of the Web Cloaking component only, as outlined in the experiment diagram.
Objective
Implement footprint-based web cloaking to simulate malicious or persuasive web pages designed to influence LLM code agents.
These web pages act as external targets that agents may follow (retrieve) or quote (use content from) when encountering injected URLs in documents.
Core Concept
- Create controlled web endpoints that host short, crafted content pieces.
- Each endpoint visually and semantically mimics legitimate developer resources, tutorials, or documentation pages.
- The cloaking layer presents different views depending on who accesses it (human vs LLM agent vs crawler).
Input
target_urlor domain under experiment controlcontent_templatesdefining multiple “faces” of the same pageagent_identifiersor access fingerprints (headers, user-agent strings, referrers)
Output
- A set of web pages served dynamically through the cloaking system
- Logged access events for each request (agent vs control)
- Metrics for footprint detection and LLM response influence
Cloaking Mechanism
- Detection Layer: Identify request source by checking agent headers, tokens, or known LLM proxy signatures.
- Content Switching:
- Serve benign version for normal users or manual inspection.
- Serve crafted (inducing) version for detected LLM/agent requests.
- Footprint Injection: Insert subtle linguistic cues or patterns designed to exploit retrieval or summarization biases.
- Logging & Tracking: Record IP, headers, and version of page served for each hit.
Page Design
- Maintain normal appearance (e.g., README-style markdown or documentation layout).
- Include small trigger sections (keywords, instructions, or links) that LLMs are likely to summarize or reuse.
- Keep token length moderate (under 500 tokens) for better capture by retrieval embeddings.
- Optionally rotate templates per visit to test robustness of footprint detection.
Safety & Experiment Control
- Host all pages under a sandboxed internal domain.
- Disable any executable scripts, downloads, or forms.
- Record all content variations and timestamps.
- Restrict exposure to external networks — pages serve as controlled stimuli only.
Evaluation Metrics
- Follow rate: % of queries or agent runs that request the cloaked URL.
- Quote rate: % of LLM outputs containing material derived from cloaked content.
- Differential response: comparison between benign vs inducing page variants.
- Traceability: all requests logged with timestamp, URL, served version, and response size.
Summary
Web cloaking provides the external behavioral layer of the attack simulation: controlled, variant web pages that reveal whether LLM agents will access, interpret, or reproduce injected online content when exposed through poisoned documents.
Metadata
Metadata
Labels
enhancementNew feature or requestNew feature or request