RamanRed/RAG-Gemini

📚 PDF Question Answering using Gemini + HuggingFace + LangChain

This project is a small retrieval-augmented Q&A pipeline that lets you query a PDF file and receive AI-generated answers, using:

  • 💡 LangChain for document splitting, embeddings, and vector storage
  • 🧠 HuggingFace MiniLM (all-MiniLM-L6-v2) embeddings for semantic similarity
  • 💃 ChromaDB as a persistent vector store
  • 🤖 Google Gemini API for generating natural language responses

🧠 What It Does

  1. Loads and reads your PDF file (e.g., How We Think.pdf)
  2. Splits the document into smaller chunks using RecursiveCharacterTextSplitter
  3. Embeds these chunks using HuggingFace all-MiniLM-L6-v2
  4. Stores the embeddings in a local Chroma vector database
  5. Takes your query and retrieves the most relevant chunks from the vector database
  6. Passes the results to Google Gemini for answer generation
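
The steps above can be sketched roughly as follows. This is a minimal sketch, not the repository's actual `rag.py`: the function names, `chunk_size`, `chunk_overlap`, and `k` are assumptions chosen for illustration, and import paths may differ between LangChain and PyPDF2 releases. Library imports sit inside the functions so the module can be imported even before the dependencies are installed.

```python
# Hedged sketch of steps 1-5. Helper names and default values are assumptions,
# not taken from the repository.

PDF_PATH = "How We Think.pdf"    # file name from the folder structure
PERSIST_DIR = "chroma_db_nccn"   # directory name from the folder structure
EMBED_MODEL = "sentence-transformers/all-MiniLM-L6-v2"


def read_pdf_text(pdf_path=PDF_PATH):
    """Step 1: extract plain text from every page of the PDF."""
    from PyPDF2 import PdfReader  # assumes PyPDF2 2.x or later

    reader = PdfReader(pdf_path)
    # extract_text() may return None or "" for image-only pages
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def build_index(text, persist_dir=PERSIST_DIR, chunk_size=1000, chunk_overlap=200):
    """Steps 2-4: split the text, embed the chunks, store them in Chroma."""
    # Newer LangChain releases expose the splitter from `langchain_text_splitters`.
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain_community.embeddings import HuggingFaceEmbeddings
    from langchain_community.vectorstores import Chroma

    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size, chunk_overlap=chunk_overlap
    )
    chunks = splitter.split_text(text)
    embeddings = HuggingFaceEmbeddings(model_name=EMBED_MODEL)
    return Chroma.from_texts(chunks, embeddings, persist_directory=persist_dir)


def format_context(chunks):
    """Join retrieved chunk texts into one context string, skipping empty ones."""
    return "\n\n".join(c.strip() for c in chunks if c.strip())


def retrieve_context(db, query, k=3):
    """Step 5: fetch the k chunks most similar to the query."""
    docs = db.similarity_search(query, k=k)
    return format_context([d.page_content for d in docs])
```

A typical call sequence would be `db = build_index(read_pdf_text())` followed by `retrieve_context(db, "your question")`; the returned string is what gets passed to Gemini in step 6.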

📁 Folder Structure

📆pdf-gemini-qa/
 ├ 📄 rag.py
 ├ 📄 requirements.txt (contents listed below)
 ├ 📄 .env
 ├ 📄 How We Think.pdf
 ┗ 📁 chroma_db_nccn/

📆 Requirements & Setup

🐍 Python Version

  • Python 3.8 or higher

⚖️ Setup Instructions

# Clone the repo
git clone https://github.com/your-username/pdf-gemini-qa.git
cd pdf-gemini-qa

# Create a virtual environment
python -m venv venv
source venv/bin/activate   # On Windows use: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

📄 Create .env File

You need to add your Gemini API key to a .env file in the root folder:

GEMINI_API_KEY=your_actual_api_key_here
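
One way the script might read this key is sketched below. It assumes `python-dotenv` (listed in the requirements) is used to load the `.env` file; the helper names `load_gemini_key` and `load_key_from_dotenv` are hypothetical and not taken from the repository.

```python
import os

PLACEHOLDER = "your_actual_api_key_here"  # the placeholder value shown above


def load_gemini_key(env=None):
    """Return GEMINI_API_KEY from the given mapping (default: os.environ).

    Raises RuntimeError if the key is missing, empty, or still the placeholder.
    """
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError("Set GEMINI_API_KEY in your .env file")
    return key


def load_key_from_dotenv():
    """Load .env from the current directory, then read the key."""
    from dotenv import load_dotenv  # provided by the python-dotenv package

    load_dotenv()  # copies entries from .env into os.environ
    return load_gemini_key()
```

Failing early with a clear message avoids a less obvious authentication error later, when the Gemini API is first called.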

📚 Required Packages

Here’s what’s in your requirements.txt:

langchain
langchain-community
chromadb
huggingface-hub
sentence-transformers
python-dotenv
google-generativeai
PyPDF2

🕽️ Installation (Alternative Manual Way)

If you’re not using requirements.txt, you can install packages manually:

pip install langchain
pip install langchain-community
pip install chromadb
pip install huggingface-hub
pip install sentence-transformers
pip install python-dotenv
pip install google-generativeai
pip install PyPDF2
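
After installing, a quick check like the following can confirm that every package is importable. It is a sketch using only the standard library; the list maps each pip package above to its usual import name, and the function name `missing_modules` is hypothetical.

```python
import importlib.util

# Import names corresponding to the pip packages listed above
REQUIRED_MODULES = [
    "langchain",
    "langchain_community",
    "chromadb",
    "huggingface_hub",
    "sentence_transformers",
    "dotenv",               # installed by python-dotenv
    "google.generativeai",  # installed by google-generativeai
    "PyPDF2",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "google") is itself missing
            found = False
        if not found:
            missing.append(name)
    return missing
```

Calling `missing_modules()` inside the activated virtual environment returns an empty list when everything is installed.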

🛠 Usage

Run the script:

python rag.py

You'll be prompted:

What is your Query:

Ask questions based on the content of your PDF!


✅ Example Query

What is your Query: What is reflective thinking according to the author?

🔐 Environment Variables

  • GEMINI_API_KEY: your Google Gemini API key, read from the .env file

🧪 Sample Prompt Sent to Gemini

## Context:
<relevant document content>

## Query:
Why does the author emphasize reflective thinking?

## Instructions:
- Use ONLY the provided context to answer the query.
- If the context does not contain enough information, say "I don't have enough information."
- Provide factual and well-structured responses.
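
A prompt in this shape could be assembled and sent along the following lines. This is a sketch under assumptions: `build_prompt` and `ask_gemini` are hypothetical helper names, and the model name `gemini-1.5-flash` is a guess, since the README does not say which Gemini model the script uses.

```python
PROMPT_TEMPLATE = """## Context:
{context}

## Query:
{query}

## Instructions:
- Use ONLY the provided context to answer the query.
- If the context does not contain enough information, say "I don't have enough information."
- Provide factual and well-structured responses.
"""


def build_prompt(context, query):
    """Fill the template with the retrieved context and the user's query."""
    return PROMPT_TEMPLATE.format(context=context.strip(), query=query.strip())


def ask_gemini(prompt, api_key, model_name="gemini-1.5-flash"):
    """Send the prompt to Gemini and return the text of the response.

    model_name is an assumption; substitute whichever model your key can access.
    """
    import google.generativeai as genai  # from the google-generativeai package

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)
    return model.generate_content(prompt).text
```

Keeping the template in one constant makes it easy to adjust the instructions without touching the retrieval code.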

🙌 Acknowledgments

  • HuggingFace
  • LangChain
  • Google AI / Gemini
