Ask Junior

A RAG (Retrieval-Augmented Generation) system that provides an intelligent conversational AI interface with full observability, automated knowledge base ingestion, and enterprise-grade infrastructure.

Ask Junior is built as a microservices architecture with five core services that work together to deliver an AI knowledge base assistant.

Services

Service	Description	Port	Technology
agent/	Conversational AI chat interface with RAG	8000	Chainlit, OpenAI
monitor/	Full observability stack	3000	Grafana, Prometheus, Loki, Tempo
integrations/	ETL & RAG pipeline for document processing	8080	Apache Airflow
traefik/	Reverse proxy and load balancer	80/443	Traefik v3.0
vector_database/	Semantic search engine	8081	Weaviate

Data Flow

1. Knowledge Ingestion: Documents from Azure DevOps or local files are processed by Airflow
2. Vectorization: Documents are chunked, embedded via OpenAI, and stored in Weaviate
3. User Query: User asks a question through the Chainlit interface
4. Semantic Search: Weaviate retrieves relevant document chunks
5. Response Generation: OpenAI GPT-4 generates a response using retrieved context
6. Observability: All operations are traced, logged, and measured with metrics

Quick Start

Prerequisites

Docker and Docker Compose
OpenAI API key
16GB RAM minimum
4 CPU cores recommended

1. Clone the Repository

git clone https://github.com/lorenzouriel/ask-junior.git
cd ask-junior

2. Start Services (see the guide inside each service directory)

Start services in the following order:

# 1. Vector Database (required first)
cd vector_database && docker compose up -d --build

# 2. Monitoring Stack
cd ../monitor && docker compose up -d --build

# 3. Integrations (Airflow)
cd ../integrations && docker compose up -d --build

# 4. Agent
cd ../agent && docker compose up -d --build

# 5. Traefik (optional, for routing)
cd ../traefik && docker compose up -d --build
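
The startup order above can also be scripted. The following is a minimal sketch, assuming it is run from the repository root with Docker Compose v2 installed; the names `STARTUP_ORDER`, `compose_commands`, and `start_all` are hypothetical helpers and not part of the repository.

```python
import subprocess
from pathlib import Path

# Service directories in the startup order listed above.
STARTUP_ORDER = ["vector_database", "monitor", "integrations", "agent", "traefik"]


def compose_commands(repo_root, include_traefik=True):
    """Return (working_directory, command) pairs in startup order."""
    services = [s for s in STARTUP_ORDER if include_traefik or s != "traefik"]
    command = ["docker", "compose", "up", "-d", "--build"]
    return [(Path(repo_root) / service, list(command)) for service in services]


def start_all(repo_root=".", include_traefik=True):
    """Run each compose command in order, stopping at the first failure."""
    for workdir, command in compose_commands(repo_root, include_traefik):
        print(f"Starting {workdir.name} ...")
        # check=True raises CalledProcessError if a service fails to start.
        subprocess.run(command, cwd=workdir, check=True)
```

Calling `start_all()` runs the same five commands as the manual steps; `start_all(include_traefik=False)` skips the optional Traefik step.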

3. Access Services

Service	URL	Credentials
Agent Chat	http://localhost:8000 or http://agent.local	-
Airflow	http://localhost:8080 or http://airflow.local	admin / admin
Grafana	http://localhost:3000 or http://grafana.local	admin / admin
Prometheus	http://localhost:9090	-
Weaviate	http://localhost:8081	API Key: `adminkey`
Traefik Dashboard	http://localhost:8888	-

Network Architecture

Each service creates its own Docker network, with Traefik connecting to all of them:

--------------------------------------------------------------
|  Docker Host                                               |
|                                                            |
|  ---------------------  ---------------------              |
|  | ask-junior-network|  | airflow-network   |              |
|  | -- agent          |  | -- postgres       |              |
|  ---------------------  | -- redis          |              |
|                         | -- scheduler      |              |
|  ---------------------  | -- worker         |              |
|  | monitoring        |  ---------------------              |
|  | -- otel-collector |                                     |
|  | -- prometheus     |  ---------------------              |
|  | -- loki           |  | weaviate          |              |
|  | -- tempo          |  | -- weaviate       |              |
|  | -- grafana        |  ---------------------              |
|  ---------------------                                     |
|                         ---------------------              |
|                         | traefik-network   |              |
|                         | -- traefik        | <- connects  |
|                         ---------------------    to all    |
--------------------------------------------------------------

Service Details

Agent (Chainlit)

The conversational interface that implements RAG:

Features: Adjustable chunk retrieval (1-20), certainty thresholds, conversation memory
Observability: Full OpenTelemetry integration (traces, logs, metrics)
Storage: SQLite for conversation history and analytics
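
To illustrate how the chunk limit and certainty threshold might interact, here is a minimal sketch. It is not the agent's actual code: the result shape (dicts with `text` and `certainty` keys), the function names, and the default values of 5 chunks and 0.7 certainty are all assumptions made for the example.

```python
MIN_CHUNKS, MAX_CHUNKS = 1, 20  # retrieval range stated in the feature list


def select_chunks(results, limit=5, certainty_threshold=0.7):
    """Keep the best-scoring chunks that meet the certainty threshold.

    `results` is a list of dicts with hypothetical keys "text" and "certainty".
    """
    # Clamp the requested limit into the supported 1-20 range.
    limit = max(MIN_CHUNKS, min(MAX_CHUNKS, limit))
    confident = [r for r in results if r["certainty"] >= certainty_threshold]
    confident.sort(key=lambda r: r["certainty"], reverse=True)
    return confident[:limit]


def build_context(chunks):
    """Join the selected chunk texts into one context string for the prompt."""
    return "\n\n".join(chunk["text"] for chunk in chunks)
```

Raising the threshold trades recall for precision: fewer chunks reach the prompt, but those that do are closer matches to the question.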

Monitor (Observability Stack)

Covers the three pillars of observability (metrics, logs, and traces):

Component	Purpose	Port
OpenTelemetry Collector	Central telemetry hub	4317/4318
Prometheus	Metrics storage	9090
Loki	Log aggregation	3100
Tempo	Distributed tracing	3200
Grafana	Unified visualization	3000

Integrations (Apache Airflow)

ETL pipelines for knowledge base management:

DAGs:

weaviate_rag_kb_azuredevops_extract: Syncs from Azure DevOps every 3 hours
weaviate_rag_kb_ingest: Processes and ingests documents every 4 hours

Supported Formats: Markdown, PDF, TXT

Chunking Strategy:

Markdown: Header-aware splitting preserving structure
Other formats: Recursive character splitting (1000-character chunks, 200-character overlap)
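
The two strategies can be illustrated with a simplified sketch. Note the substitutions: `split_with_overlap` is a plain fixed-size sliding window rather than a true recursive splitter (which would first try paragraph and sentence boundaries), and `split_markdown_by_headers` is a bare-bones header splitter. Both function names are hypothetical and the pipeline's real implementation may differ.

```python
import re


def split_with_overlap(text, chunk_size=1000, overlap=200):
    """Sliding window: each chunk repeats the last `overlap` chars of the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


def split_markdown_by_headers(markdown):
    """Split Markdown into (header, body) sections so each chunk keeps its heading.

    Simplification: lines starting with '#' inside code fences are treated as headers.
    """
    sections = []
    header, body = "", []
    for line in markdown.splitlines():
        if re.match(r"^#{1,6}\s", line):
            if header or any(part.strip() for part in body):
                sections.append((header, "\n".join(body).strip()))
            header, body = line.strip(), []
        else:
            body.append(line)
    if header or any(part.strip() for part in body):
        sections.append((header, "\n".join(body).strip()))
    return sections
```

With the defaults, consecutive chunks start 800 characters apart, so a sentence that straddles a chunk boundary still appears whole in at least one chunk as long as it is shorter than the 200-character overlap.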

Vector Database (Weaviate)

Semantic search engine with OpenAI embeddings:

Modules: text2vec-openai, backup-filesystem, qna-openai
Auth: API key-based (readonlykey, adminkey)
Resources: 4 CPU cores, 6GB memory limit
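
As a quick way to confirm that API-key authentication works, the sketch below queries Weaviate's `/v1/meta` REST endpoint using only the Python standard library. It assumes the instance is reachable at `http://localhost:8081` as listed in the Services table and that the key is sent as a bearer token; the helper names are hypothetical.

```python
import json
import urllib.request

WEAVIATE_URL = "http://localhost:8081"  # port from the Services table


def build_meta_request(api_key, base_url=WEAVIATE_URL):
    """Build a GET request for Weaviate's /v1/meta endpoint with API-key auth."""
    return urllib.request.Request(
        f"{base_url}/v1/meta",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_meta(api_key, base_url=WEAVIATE_URL, timeout=5):
    """Send the request and return the decoded JSON metadata."""
    request = build_meta_request(api_key, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the vector database running, `fetch_meta("adminkey")` should return the server metadata; a rejected key surfaces as an `urllib.error.HTTPError`.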

Traefik (Reverse Proxy)

Cloud-native reverse proxy:

Host-based routing for all services
Automatic service discovery via Docker
Dashboard on port 8888

Ports Summary

Port	Service	Protocol
80	Traefik HTTP	HTTP
443	Traefik HTTPS	HTTPS
3000	Grafana	HTTP
3100	Loki	HTTP
3200	Tempo	HTTP
4317	OTel Collector	gRPC
4318	OTel Collector	HTTP
8000	Agent	HTTP
8080	Airflow	HTTP
8081	Weaviate	HTTP
8888	Traefik Dashboard	HTTP
9090	Prometheus	HTTP
50051	Weaviate	gRPC
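
Because the stack binds many host ports, it can help to check for conflicts before starting the services. The following is a minimal sketch, assuming all ports are published on localhost; the helper names are hypothetical, and ports 80 and 443 may report as free even when binding them requires elevated privileges.

```python
import socket

# Host ports from the Ports Summary table.
REQUIRED_PORTS = [80, 443, 3000, 3100, 3200, 4317, 4318, 8000, 8080, 8081, 8888, 9090, 50051]


def is_port_free(port, host="127.0.0.1"):
    """Return True if nothing is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 only when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


def ports_in_use(ports=REQUIRED_PORTS):
    """Return the subset of ports that already have a listener."""
    return [port for port in ports if not is_port_free(port)]
```

Running `ports_in_use()` before the first `docker compose up` lists any ports that another process already holds.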

Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test with docker compose up --build
5. Submit a pull request

License

MIT License

Support

Issues: https://github.com/lorenzouriel/ask-junior/issues
Documentation: See individual service README files
