Conversation Attachments

This document describes how conversation attachments work in the Conversations application, including the upload process, security measures, and how documents are processed for use with Large Language Models (LLMs).

Overview
Supported Attachment Types
Architecture & Flow
- High-Level Overview
- Detailed Technical Flow
Security & Validation
- MIME Type Validation
- Malware Detection
Document Processing for LLMs
Configuration

Overview

Conversations allows users to attach files to their conversations with the AI assistant. These attachments can be:

Images (displayed directly to vision-capable LLMs)
PDF documents (sent as document URLs to the LLM)
Other documents (converted to text and indexed for semantic search)

The attachment system uses S3-compatible object storage (such as MinIO in development) to store files securely. The backend generates presigned URLs that allow the frontend to upload files directly to the storage, without routing the file data through the backend server.

Note about documents: The system uses a tool called MarkItDown to convert various document formats (Word, Excel, PowerPoint, text files, etc.) into Markdown text for processing by LLMs. When at least one non-PDF/image document is attached, the system enables:

a Retrieval-Augmented Generation (RAG) search tool to allow the LLM to query relevant sections of the documents.
a summarization tool to provide document summaries on user request. ⚠️ naive implementation at the moment, needs improvement before being used in production.

Supported Attachment Types

The following attachment types are supported:

Images: image/png, image/jpeg, image/gif, image/webp.
PDF documents: application/pdf
Other documents:
- Microsoft Word: application/vnd.openxmlformats-officedocument.wordprocessingml.document
- Microsoft Excel: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
- Microsoft PowerPoint: application/vnd.openxmlformats-officedocument.presentationml.presentation
- Text files: text/plain, text/markdown, text/csv

Warning: The current implementation for PDF expects the LLM to be able to manage them. We need to improve the handling of PDFs in case the LLM cannot process them natively.

Todo:

Add support for more file types and improve document processing workflows.
Allow PDF management via RAG search when the LLM cannot handle them natively.
Allow file type restrictions based on model settings, instead of globally.
Improve the summarization tool to provide better summaries and handle larger documents.
Start file upload right away when the user selects a file, instead of waiting for the user to send the message.

Architecture & Flow

High-Level Overview

┌─────────────┐       ┌─────────────┐       ┌─────────────┐       ┌─────────────┐
│   Frontend  │       │   Backend   │       │  S3 Storage │       │ Malware Det.│
└──────┬──────┘       └──────┬──────┘       └──────┬──────┘       └──────┬──────┘
       │                     │                     │                     │
       │ 1. Create attachment│                     │                     │
       ├────────────────────>│                     │                     │
       │                     │                     │                     │
       │ 2. Return presigned │                     │                     │
       │    URL for upload   │                     │                     │
       │<────────────────────┤                     │                     │
       │                     │                     │                     │
       │ 3. Upload file      │                     │                     │
       │    directly to S3   │                     │                     │
       ├──────────────────────────────────────────>│                     │
       │                     │                     │                     │
       │ 4. Notify upload    │                     │                     │
       │    completed        │                     │                     │
       ├────────────────────>│                     │                     │
       │                     │                     │                     │
       │                     │ 5. Detect MIME type │                     │
       │                     ├────────────────────>│                     │
       │                     │                     │                     │
       │                     │ 6. Scan for malware │                     │
       │                     ├──────────────────────────────────────────>│
       │                     │                     │                     │
       │                     │ 7. Update status    │                     │
       │ 8. Return status    │<──────────────────────────────────────────┤
       │<────────────────────┤                     │                     │
       │                     │                     │                     │

Detailed Technical Flow

Step 1: Attachment Creation Request

When a user selects a file to upload, the frontend sends a POST request to create an attachment record:

Endpoint: POST /api/conversations/{conversation_id}/attachments/

Request payload:

{
  "file_name": "document.pdf",
  "size": 1048576,
  "content_type": "application/pdf"
}

Backend processing (ChatConversationAttachmentViewSet.perform_create):

Verifies the user owns the conversation
Generates a unique UUID for the file
Creates a storage key: {conversation_id}/attachments/{uuid}.{extension}
Creates a database record with status PENDING

Response:

{
  "id": "uuid-of-attachment",
  "key": "conversation-id/attachments/file-id.pdf",
  "file_name": "document.pdf",
  "size": 1048576,
  "upload_state": "pending",
  "policy": "https://s3.example.com/bucket/...?presigned-params"
}

The policy field contains a presigned URL valid for a limited time (configured by AWS_S3_UPLOAD_POLICY_EXPIRATION).

Step 2: Direct Upload to S3

The frontend uses the presigned URL to upload the file directly to S3 storage using a PUT request.

Technical details:

The presigned URL includes authentication parameters
The upload is done with Content-Type header matching the file's MIME type
No backend involvement in the data transfer

Step 3: Upload Completion Notification

After successful upload, the frontend notifies the backend:

Endpoint: POST /api/conversations/{conversation_id}/attachments/{attachment_id}/upload-ended/

Backend processing (ChatConversationAttachmentViewSet.upload_ended):

MIME Type Detection (chat/views.py):

mime_detector = magic.Magic(mime=True)
with default_storage.open(attachment.key, "rb") as file:
    mimetype = mime_detector.from_buffer(file.read(2048))
    size = file.size

Uses python-magic to detect the actual MIME type from file content (first 2048 bytes).

Update attachment status:
- Status: PENDING → ANALYZING
- Store detected MIME type and actual file size

Trigger Malware Detection:

malware_detection.analyse_file(
    attachment.key,
    safe_callback="chat.malware_detection.conversation_safe_attachment_callback",
    unknown_callback="chat.malware_detection.unknown_attachment_callback",
    unsafe_callback="chat.malware_detection.conversation_unsafe_attachment_callback",
    conversation_id=conversation_id,
)

Step 4: Malware Detection Callbacks

The malware detection service (configurable via MALWARE_DETECTION_BACKEND) scans the file and calls one of three callbacks:

Safe file (conversation_safe_attachment_callback):

Status: ANALYZING → READY
File is ready for use

Unsafe file (conversation_unsafe_attachment_callback):

Status: ANALYZING → SUSPICIOUS
File is quarantined and not accessible
Security log entry created

Unknown status (unknown_attachment_callback):

Handles special cases (e.g., file too large to analyze)
Status: ANALYZING → FILE_TOO_LARGE_TO_ANALYZE

Security & Validation

For now, the system is not intended to host user-uploaded files for public download. All files are stored in private S3 buckets with presigned URLs for controlled access and only the owner of the conversation/the uploader can access them, so the risk is quite low around bad use of the attachment system.

Also, the document content is sent to the LLM and does not prevent any prompt injection attacks, which is not an issue specific to the attachment system but to the overall design of LLM-based applications and should be addressed globally. Also for the moment, the system does not have any action tools that could be used to execute malicious code based on document content.

Malware Detection

The malware detection system is pluggable and configurable, allowing different backends to be used. By default, a DummyBackend is provided that marks all files as safe.

⚠️ The current implementation does not disallow any file types or status from being used in conversations. This is a potential security risk and should be addressed in future versions.

Document Processing for LLMs

When a user sends a message with attachments, the system processes them differently based on their type:

Image Attachments

MIME types: image/png, image/jpeg, image/gif, image/webp, etc.

Processing flow:

URL Conversion: Local media URLs are converted to presigned S3 URLs before sending to the LLM:

# From: chat/agents/local_media_url_processors.py
content.url = generate_retrieve_policy(key)

Sent to LLM: Images are sent as ImageUrl objects in the prompt:

ImageUrl(
    url="https://s3.example.com/bucket/key?presigned-params",
    identifier="file-id.png",
)

Vision models can analyze the image content directly.
Response processing: After the LLM responds, presigned URLs are converted back to local URLs for storage:
```
# Mapping: presigned_url -> /media-key/{conversation_id}/attachments/{file_id}.png
```

PDF Documents

MIME type: application/pdf

Processing flow:

Direct URL passing: PDFs are sent as DocumentUrl objects :

DocumentUrl(
    url="https://s3.example.com/bucket/key?presigned-params",
    identifier="file-id.pdf",
)

LLM processing: Compatible LLMs can:
- Extract and read text from PDFs
- Understand document structure
- Answer questions about the content
No conversion needed: PDFs are passed directly without preprocessing.

Processing Strategy Decision Tree

Decision logic:

No documents: Standard conversation
Images: Send as direct (presigned) URLs to the LLM
Only PDFs: Send as direct (presigned) URLs to the LLM
Other documents present: Enable RAG search tool + convert to Markdown

Configuration

Environment Variables

Variable	Default	Description
`ATTACHMENT_MAX_SIZE`	Configurable	Maximum file size in bytes
`ATTACHMENT_CHECK_UNSAFE_MIME_TYPES_ENABLED`	`True`	Enable/disable MIME type validation
`AWS_S3_UPLOAD_POLICY_EXPIRATION`	3600	Presigned URL expiration (seconds)
`AWS_S3_RETRIEVE_POLICY_EXPIRATION`	3600	Presigned retrieval URL expiration (seconds)
`AWS_S3_DOMAIN_REPLACE`	None	Alternative S3 domain for presigned URLs (for development)
`MALWARE_DETECTION_BACKEND`	`DummyBackend`	Malware scanning backend class
`MALWARE_DETECTION_PARAMETERS`	`{}`	Backend-specific configuration
`RAG_FILES_ACCEPTED_FORMATS`	See below	List of MIME types accepted for file uploads

RAG_FILES_ACCEPTED_FORMATS

This environment variable controls which file types users are allowed to upload as attachments to conversations.

Configuration:

Type: List of strings (comma-separated MIME types when using environment variable)
Default value: Includes a comprehensive list of document and image formats:
- Microsoft Office documents (.docx, .pptx, .xlsx, .xls)
- Text files (.txt, .csv)
- PDF documents (.pdf)
- HTML files
- Markdown files (.md)
- Outlook messages (.msg)
- Images (.jpeg, .png, .gif, .webp)

Example configuration:

# In environment variable (comma-separated)
RAG_FILES_ACCEPTED_FORMATS="application/pdf,text/plain,image/png,image/jpeg"

# In Django settings (as a Python list)
RAG_FILES_ACCEPTED_FORMATS = [
    "application/pdf",
    "text/plain",
    "image/png",
    "image/jpeg",
]

How it's used:

Backend: The list is exposed via the /api/v1.0/config/ endpoint as chat_upload_accept (MIME types joined with commas)
Frontend: The configuration is used to validate files before upload in the chat interface:
- Checks exact MIME type matches
- Supports wildcard patterns (e.g., image/* for all image types)
- Supports file extension patterns (e.g., .pdf)
User experience: Files that don't match the accepted formats are rejected with a user-friendly error message

Notes:

This setting controls frontend validation only. Backend validation should also be implemented for security.
Future improvements may include per-model file type restrictions.

Storage Configuration

MinIO (Development):

# docker-compose.yml
minio:
  image: minio/minio
  environment:
    MINIO_ROOT_USER: minioadmin
    MINIO_ROOT_PASSWORD: minioadmin
  command: server /data --console-address ":9001"

Troubleshooting

LLM Cannot Access Image/PDF

Possible causes:

Presigned URL has expired
S3 storage is not accessible from the LLM provider
CORS configuration issues

Solution: Check AWS_S3_RETRIEVE_POLICY_EXPIRATION and S3 access policies.

Document Not Appearing in RAG Search

Possible causes:

Document conversion failed
Vector database indexing failed

Check logs: Look for errors in DocumentConverter and RAG backend logs.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Conversation Attachments

Table of Contents

Overview

Supported Attachment Types

Architecture & Flow

High-Level Overview

Detailed Technical Flow

Step 1: Attachment Creation Request

Step 2: Direct Upload to S3

Step 3: Upload Completion Notification

Step 4: Malware Detection Callbacks

Security & Validation

Malware Detection

Document Processing for LLMs

Image Attachments

PDF Documents

Other Document Types

Processing Strategy Decision Tree

Configuration

Environment Variables

RAG_FILES_ACCEPTED_FORMATS

Storage Configuration

Troubleshooting

LLM Cannot Access Image/PDF

Document Not Appearing in RAG Search

Related Documentation

FilesExpand file tree

attachments.md

Latest commit

History

attachments.md

File metadata and controls

Conversation Attachments

Table of Contents

Overview

Supported Attachment Types

Architecture & Flow

High-Level Overview

Detailed Technical Flow

Step 1: Attachment Creation Request

Step 2: Direct Upload to S3

Step 3: Upload Completion Notification

Step 4: Malware Detection Callbacks

Security & Validation

Malware Detection

Document Processing for LLMs

Image Attachments

PDF Documents

Other Document Types

Processing Strategy Decision Tree

Configuration

Environment Variables

RAG_FILES_ACCEPTED_FORMATS

Storage Configuration

Troubleshooting

LLM Cannot Access Image/PDF

Document Not Appearing in RAG Search

Related Documentation