Django RAG Chatbot System

A production-ready Django-based RAG (Retrieval-Augmented Generation) chatbot system that enables users to create chatbots powered by document knowledge bases. The system uses PostgreSQL with pgvector for vector storage and OpenAI for embeddings and language generation.

Features

Chatbot Management: Create chatbots with document knowledge bases
Document Processing: Support for PDF, TXT, and DOCX files
RAG System: Semantic search using vector embeddings with pgvector
User Authentication: Token-based authentication with minimal friction
Chat History: Isolated chat sessions with complete history tracking
RESTful API: Comprehensive API for all operations
Transaction Safety: Atomic operations with automatic rollback on failures
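
The semantic search behind the RAG feature comes down to ranking stored chunk embeddings by their distance to the query embedding. The following is a minimal pure-Python sketch of that idea, assuming cosine distance (the metric pgvector's `<=>` operator computes). The function names and the toy 3-dimensional vectors are hypothetical; in the real system the embeddings come from OpenAI and the ranking would be done inside PostgreSQL.

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k_chunks(query_embedding, chunks, k=2):
    """Return the k chunk texts whose embeddings are closest to the query.

    `chunks` is a list of (text, embedding) pairs: a hypothetical in-memory
    stand-in for rows stored in a pgvector-backed table.
    """
    ranked = sorted(chunks, key=lambda c: cosine_distance(query_embedding, c[1]))
    return [text for text, _ in ranked[:k]]


# Toy data, purely illustrative.
chunks = [
    ("Refund policy", [0.9, 0.1, 0.0]),
    ("Shipping times", [0.1, 0.9, 0.0]),
    ("Return window", [0.8, 0.2, 0.1]),
]
query = [1.0, 0.0, 0.0]
print(top_k_chunks(query, chunks))  # ['Refund policy', 'Return window']
```

The retrieved chunk texts would then be passed to the language model as context for the answer.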

Architecture

The system follows a clean architecture with three main layers:

API Layer: Django REST Framework endpoints with validation
Service Layer: Business logic and transaction management
Data Layer: PostgreSQL with pgvector for relational and vector data
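
The split between the layers can be sketched as follows. This is an illustration only: it uses the standard-library `sqlite3` module in place of Django and PostgreSQL so that it runs on its own, and every function and table name is hypothetical. In the Django project the service-layer transaction would more likely be expressed with `django.db.transaction.atomic`.

```python
import sqlite3


def make_db():
    """Data layer stand-in: an in-memory SQLite database (the project uses PostgreSQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE chatbot (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE document ("
        "id INTEGER PRIMARY KEY, chatbot_id INTEGER NOT NULL, filename TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def create_chatbot(conn, name, filenames):
    """Service layer: create a chatbot and its documents as one atomic unit."""
    # `with conn:` commits on success and rolls back if any statement raises.
    with conn:
        cur = conn.execute("INSERT INTO chatbot (name) VALUES (?)", (name,))
        chatbot_id = cur.lastrowid
        for filename in filenames:
            conn.execute(
                "INSERT INTO document (chatbot_id, filename) VALUES (?, ?)",
                (chatbot_id, filename),
            )
    return chatbot_id


def handle_create_request(conn, payload):
    """API layer: validate input, then delegate to the service layer."""
    name = payload.get("name", "").strip()
    if not name:
        return {"status": 400, "error": "name is required"}
    try:
        chatbot_id = create_chatbot(conn, name, payload.get("files", []))
    except sqlite3.Error as exc:
        return {"status": 500, "error": str(exc)}
    return {"status": 201, "id": chatbot_id}


conn = make_db()
ok = handle_create_request(conn, {"name": "support-bot", "files": ["faq.pdf"]})
# A None filename violates NOT NULL, so the chatbot insert is rolled back as well.
failed = handle_create_request(conn, {"name": "broken-bot", "files": ["ok.txt", None]})
print(ok["status"], failed["status"])
print(conn.execute("SELECT COUNT(*) FROM chatbot").fetchone()[0])
```

Because the failed request is rolled back as a whole, no half-created chatbot is left behind, which is the behaviour the Transaction Safety feature describes.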

Prerequisites

Python 3.10+
PostgreSQL 14+ with pgvector extension
OpenAI API key

Installation

1. Clone the Repository

git clone <repository-url>
cd <project-directory>

2. Set Up Python Environment

# Create virtual environment
python -m venv .venv

# Activate virtual environment
# On Linux/Mac:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

3. Set Up PostgreSQL with pgvector

# Install PostgreSQL (if not already installed)
# On Ubuntu/Debian:
sudo apt-get install postgresql postgresql-contrib

# On macOS with Homebrew:
brew install postgresql

# Install pgvector extension
# Follow instructions at: https://github.com/pgvector/pgvector

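# Create the database and user
# (the SQL statements below are typed at the psql prompt, not in the shell)
sudo -u postgres psql
CREATE DATABASE django_rag_db;
CREATE USER your_user WITH PASSWORD 'your_password';
GRANT ALL PRIVILEGES ON DATABASE django_rag_db TO your_user;
\q

# Enable the pgvector extension in the new database
# (if this fails with a permission error, connect as a superuser instead,
# e.g. sudo -u postgres psql -d django_rag_db)
psql -U your_user -d django_rag_db
CREATE EXTENSION IF NOT EXISTS vector;
\q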

4. Configure Environment Variables

# Copy example environment file
cp .env.example .env

# Edit .env and update with your values
# Required variables:
# - DJANGO_SECRET_KEY
# - DATABASE_URL or DB_* variables
# - OPENAI_API_KEY

5. Run Database Migrations

python manage.py migrate

6. Create Media Directory

mkdir -p media/documents

7. Run the Development Server

python manage.py runserver

The API will be available at http://localhost:8000/api/

Configuration

Environment Variables

See .env.example for all available configuration options. Key variables:

| Variable | Description | Required | Default |
| --- | --- | --- | --- |
| `DJANGO_SECRET_KEY` | Django secret key for cryptographic signing | Yes | - |
| `DJANGO_DEBUG` | Enable debug mode (set to False in production) | No | True |
| `DATABASE_URL` | PostgreSQL connection string | Yes | - |
| `OPENAI_API_KEY` | OpenAI API key for embeddings and LLM | Yes | - |
| `CHUNK_SIZE` | Text chunk size for document processing | No | 1000 |
| `CHUNK_OVERLAP` | Overlap between text chunks | No | 200 |
| `MAX_UPLOAD_SIZE` | Maximum file upload size in bytes | No | 10485760 |
| `TOKEN_LENGTH` | Length of generated authentication tokens | No | 32 |
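
To make the interaction between `CHUNK_SIZE` and `CHUNK_OVERLAP` concrete, here is a minimal sketch of overlapping chunking. It assumes the sizes are measured in characters (the project may count tokens instead), and the helper names `env_int` and `chunk_text` are hypothetical, not taken from the codebase.

```python
import os


def env_int(name, default):
    """Read an integer setting from the environment, falling back to a default."""
    raw = os.environ.get(name)
    return int(raw) if raw not in (None, "") else default


def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # distance between consecutive window starts
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# Defaults mirror the table above; environment variables override them.
CHUNK_SIZE = env_int("CHUNK_SIZE", 1000)
CHUNK_OVERLAP = env_int("CHUNK_OVERLAP", 200)

pieces = chunk_text("x" * 2500, CHUNK_SIZE, CHUNK_OVERLAP)
print([len(p) for p in pieces])
```

With the default values a new chunk starts every 800 characters, and each chunk repeats the last 200 characters of the one before it, so a sentence cut at a chunk boundary still appears whole in one of the two chunks.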

File Upload Limits

Maximum file size: 10 MB (10485760 bytes; configurable via MAX_UPLOAD_SIZE)
Supported formats: PDF, TXT, DOCX
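
These limits can be checked before a file is processed. The sketch below is an assumption about how such a check might look, not the project's actual validator; `validate_upload` and `ALLOWED_EXTENSIONS` are hypothetical names.

```python
import os

MAX_UPLOAD_SIZE = 10485760  # default MAX_UPLOAD_SIZE from the configuration, in bytes
ALLOWED_EXTENSIONS = {".pdf", ".txt", ".docx"}


def validate_upload(filename, size_bytes, max_size=MAX_UPLOAD_SIZE):
    """Return a list of problems with an upload; an empty list means it passes."""
    errors = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        errors.append(f"unsupported format: {extension or 'no extension'}")
    if size_bytes > max_size:
        errors.append(f"file too large: {size_bytes} bytes (limit {max_size})")
    return errors


print(validate_upload("handbook.PDF", 2_000_000))
print(validate_upload("diagram.png", 20_000_000))
```

An extension check only looks at the file name, so a real validator would probably also inspect the file contents before parsing.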

API Documentation

See API_DOCS.md for complete API documentation including:

Endpoint descriptions
Request/response formats
Authentication requirements
Error codes and handling
Example requests

Testing

Run All Tests

pytest

Run Specific Test Categories

# Unit tests only
pytest -m "not integration"

# Integration tests only
pytest -m integration

# Property-based tests
pytest -k "property"

Test Coverage

pytest --cov=apps --cov=rag_integration --cov-report=html

Deployment

See DEPLOYMENT.md for detailed deployment instructions including:

Production configuration
Database setup and optimization
Security considerations
Scaling strategies
Monitoring and logging

Quick Production Checklist

Project Structure

project/
├── apps/
│   ├── authentication/      # User authentication and token management
│   ├── chatbots/           # Chatbot and document management
│   ├── chats/              # Chat messages and history
│   └── core/               # Shared utilities and document processing
├── rag_integration/        # RAG service and embeddings
├── config/                 # Django settings and configuration
├── media/                  # Uploaded documents
├── logs/                   # Application logs
├── .env                    # Environment variables (not in git)
├── .env.example           # Example environment configuration
├── requirements.txt       # Python dependencies
└── manage.py              # Django management script

Development

Code Style

This project follows PEP 8 style guidelines. Format code with:

black .
isort .

Adding New Dependencies

pip install <package>
pip freeze > requirements.txt

Database Migrations

# Create new migration
python manage.py makemigrations

# Apply migrations
python manage.py migrate

# Show migration status
python manage.py showmigrations

Troubleshooting

pgvector Extension Not Found

# Ensure pgvector is installed and enabled
psql -U your_user -d django_rag_db -c "CREATE EXTENSION IF NOT EXISTS vector;"

OpenAI API Errors

Verify your API key is correct in .env
Check your OpenAI account has available credits
Ensure you're using a supported model

File Upload Errors

Check MEDIA_ROOT directory exists and is writable
Verify file size is under MAX_UPLOAD_SIZE
Ensure file format is PDF, TXT, or DOCX

Database Connection Issues

Verify PostgreSQL is running
Check database credentials in .env
Ensure database exists and user has proper permissions
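
When checking the credentials, it can help to break `DATABASE_URL` into its parts and compare each one against the database you created. The sketch below assumes the common `postgresql://user:password@host:port/dbname` form, which this README does not spell out; `describe_database_url` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def describe_database_url(url):
    """Break a PostgreSQL connection URL into the parts worth double-checking."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password_set": parts.password is not None,  # avoid printing the secret itself
        "host": parts.hostname,
        "port": parts.port or 5432,  # 5432 is PostgreSQL's default port
        "database": parts.path.lstrip("/"),
    }


example = "postgresql://your_user:your_password@localhost:5432/django_rag_db"
print(describe_database_url(example))
```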

Contributing

Create a feature branch
Make your changes
Write tests for new functionality
Ensure all tests pass
Submit a pull request

License

See LICENSE file for details.

Support

For issues and questions:

Check the API documentation (API_DOCS.md)
Review the deployment guide (DEPLOYMENT.md)
Open an issue on GitHub
