ChronoChat is a video-RAG platform built on top of Ollama that lets users chat with video content without requiring vision/video-language models (VLMs). It supports both YouTube and local uploads and uses retrieval-augmented generation (RAG) to answer questions from video transcripts, frames, and captions. Powered by local LLMs, ChronoChat streams responses in real time, with additional support for image and PDF uploads.
demo.mp4
> [!NOTE]
> ChronoChat is ideal for:
> - ✅ Interviews, tutorials, and educational content
> - ❌ Not suited for animations or silent videos
```shell
# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
python cli.py install
```

For GPU acceleration, install the CUDA-enabled version of PyTorch:
Visit https://pytorch.org/get-started/locally/ to get the correct command for your system.
💡 If you don't have an NVIDIA GPU or don't want CUDA, skip this step.
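To confirm the CUDA-enabled build is working (assuming `torch` is installed), a quick check like this should report `True` on a machine with a visible NVIDIA GPU:

```python
import torch

# Reports whether PyTorch was built with CUDA support and can see a GPU
print("CUDA available:", torch.cuda.is_available())
```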
ChronoChat requires ffmpeg for processing video and audio.
Download from: https://ffmpeg.org/download.html
If you haven't already, install Ollama.

```shell
# Start the Ollama server
ollama serve
```

```shell
# Launch ChronoChat
python cli.py start
```

Then open your browser at: http://localhost:3000
- 🔎 Video RAG: Uses CLIP, Whisper, and BLIP embeddings for frame, audio, and caption-based retrieval.
- 🧠 LLM Planning: Models generate reasoning chains, plan actions, and adapt to single or multi-video chats.
- 🔁 Streaming Responses: Live WebSocket chat with markdown rendering and response progress updates.
- 🎥 Multi-Video Support: Search and reason across multiple videos in a single conversation.
- 📎 Attach Files: Supports uploading PDFs and images.
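Transcript-based retrieval typically works on timestamped chunks rather than raw Whisper segments. The helper below is a hypothetical illustration (`Segment` and `chunk_transcript` are not ChronoChat's actual API) of how short segments can be merged into bounded-size chunks that keep their start/end times:

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start: float  # seconds
    end: float
    text: str

def chunk_transcript(segments: list[Segment], max_chars: int = 200) -> list[Segment]:
    """Merge consecutive transcript segments into retrieval chunks of bounded size."""
    chunks: list[Segment] = []
    for seg in segments:
        if chunks and len(chunks[-1].text) + len(seg.text) + 1 <= max_chars:
            last = chunks[-1]
            # Extend the current chunk: keep its start, adopt the new end
            chunks[-1] = Segment(last.start, seg.end, last.text + " " + seg.text)
        else:
            chunks.append(seg)
    return chunks

segments = [
    Segment(0.0, 2.5, "Welcome to the tutorial."),
    Segment(2.5, 5.0, "Today we cover retrieval."),
    Segment(5.0, 8.0, "Each chunk keeps its timestamps."),
]
for chunk in chunk_transcript(segments, max_chars=60):
    print(f"[{chunk.start:.1f}-{chunk.end:.1f}] {chunk.text}")
```

Keeping timestamps on each chunk is what lets answers point back to the moment in the video they came from.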
```mermaid
---
config:
  look: handDrawn
  theme: neutral
---
graph TD
    subgraph "Frontend (Next.js)"
        Sidebar["🗂 Chats & Videos"]
        UploadUI["📦 Upload videos"]
        ChatUI["💬 Chat interface"]
        APIClient["🌐 REST client"]
        WSClient["🔌 WebSocket client"]
    end
    subgraph "Backend (FastAPI & Async Worker)"
        ChatRouter["🗨️ Chat router"]
        MediaRouter["🎬 Media router"]
        VideoRAG["🧠 VideoRAG engine"]
        ContextExtractor["🔎 Context extractor"]
        Retriever["📦 ChromaDB retriever"]
        LLMClient["🤖 LLM client"]
        Worker["⚙️ Ingestion worker"]
        MediaDB["🗄️ ChromaDB"]
        MediaStorage["📁 Video and metadata storage"]
        VideoQueue["📮 Processing queue"]
    end
    Sidebar --> ChatUI
    UploadUI --> APIClient
    ChatUI -- "File upload" --> APIClient
    ChatUI <-- "Text query" --> WSClient
    APIClient <--> MediaRouter
    WSClient <--> ChatRouter
    ChatRouter --> VideoRAG
    VideoRAG <-- "Video query" --> ContextExtractor
    VideoRAG <-- "Other query" --> LLMClient
    ContextExtractor <--> Retriever
    ContextExtractor <--> LLMClient
    Retriever <--> MediaDB
    MediaRouter --> MediaStorage
    MediaRouter --> VideoQueue
    VideoQueue --> Worker
    Worker --> MediaDB
```
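The VideoQueue-to-Worker hand-off in the diagram can be sketched with a plain `asyncio.Queue`. This is a minimal sketch, not ChronoChat's actual code: `ingest_worker` and `process_video` are illustrative names, and the real pipeline would extract frames and audio, embed them, and write to ChromaDB where the placeholder sleeps:

```python
import asyncio

async def process_video(video_id: str) -> str:
    # Placeholder for the real pipeline: extract frames/audio, embed, store.
    await asyncio.sleep(0)
    return f"{video_id}: embedded"

async def ingest_worker(queue: asyncio.Queue, results: list[str]) -> None:
    # Drain the queue until a None sentinel arrives.
    while (video_id := await queue.get()) is not None:
        results.append(await process_video(video_id))
        queue.task_done()

async def main() -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    results: list[str] = []
    worker = asyncio.create_task(ingest_worker(queue, results))
    for vid in ("intro.mp4", "lecture.mp4"):
        await queue.put(vid)   # uploads enqueued by the media router
    await queue.put(None)      # sentinel: no more uploads
    await worker
    return results

print(asyncio.run(main()))
```

Decoupling uploads from processing this way keeps the upload endpoint fast: the HTTP request returns as soon as the video is enqueued, while embedding happens in the background.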
| Layer | Tools |
|---|---|
| Frontend | Next.js, TailwindCSS, Shadcn, TypeScript |
| Backend | FastAPI, AsyncIO, SQLite, ChromaDB |
| Embeddings | CLIP (frames), Whisper (audio), BLIP (captions) |
| LLM | Ollama |
| Storage | Local files, ChromaDB vectors, SQLite |
1. Ingest Video: Extracts audio, frames, and captions from YouTube/local videos.
2. Embed Content: Computes multimodal embeddings and stores them in ChromaDB.
3. Chat Interaction: LLM receives the user query and selects a retrieval mode.
4. RAG Flow: Relevant chunks are retrieved based on video context.
5. Response Streaming: Final output is streamed to the user in real time.
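The retrieval step boils down to ranking stored vectors by similarity to the query embedding. Here is a toy sketch with plain cosine similarity; the hand-written three-dimensional vectors and chunk ids stand in for the CLIP/Whisper/BLIP embeddings that ChromaDB would actually hold:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy index: chunk id -> embedding (real embeddings come from CLIP/Whisper/BLIP)
index = {
    "frame_0012": [0.9, 0.1, 0.0],
    "caption_03": [0.2, 0.8, 0.1],
    "audio_chunk_7": [0.1, 0.2, 0.9],
}

def retrieve(query_vec: list[float], k: int = 2) -> list[str]:
    """Return the ids of the k chunks most similar to the query embedding."""
    ranked = sorted(index, key=lambda cid: cosine(query_vec, index[cid]), reverse=True)
    return ranked[:k]

print(retrieve([1.0, 0.0, 0.1]))  # nearest to the frame-like direction
```

In practice ChromaDB performs this ranking internally; the sketch only shows the principle behind "relevant chunks are retrieved based on video context".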