This repository contains code and notebooks for detecting and explaining hallucinations produced by large language models (LLMs). It includes data processing, model training, evaluation, explainability, and a Wikipedia-based verification component.
Repository structure
- README.md — this file
- requirements.txt — pip installable dependencies
- environment.yml — conda environment specification
- data/
- raw/ — original/raw CSV datasets (halueval_qa.csv, truthfulqa.csv)
- processed/ — processed train/val/test CSVs
- models/
- best_model.pt — trained model checkpoint
- metrics.json — evaluation metrics
- training_history.json — training logs/history
- notebooks/
- 01_data_exploration.ipynb — EDA and dataset inspection
- 02_model_training.ipynb — model training and experiments
- 03_explainability.ipynb — explainability analyses and visualizations
- results/ — (placeholder for generated outputs / figures)
- src/ — core Python modules
- data_loader.py — data loading utilities
- data_processor.py — preprocessing and feature engineering
- dataset.py — dataset classes / PyTorch Dataset wrappers
- model.py — model definition and helpers
- trainer.py — training loop and checkpoints
- predictor.py — inference utilities
- evaluator.py — evaluation metrics and routines
- explainer.py — explainability methods
- wikipedia_verifier.py — verification helpers using Wikipedia
- `__init__.py` — marks src/ as a Python package
Prerequisites
- Git
- Python 3.11 (recommended to match environment.yml)
- Conda or a virtualenv for pip installs
- A CPU is sufficient; a GPU, if available, speeds up training
Using pip + venv
- Create and activate a virtual environment:
- python -m venv .venv
- source .venv/bin/activate (Linux/macOS) or .venv\Scripts\activate (Windows)
- Install dependencies:
- pip install --upgrade pip
- pip install -r requirements.txt
Using conda
- Create the conda environment from the provided file:
- conda env create -f environment.yml
- Activate it:
- conda activate llm-halu-env
Notes
- If using conda, the pip section of environment.yml installs the additional pip-only packages automatically.
- For spaCy, run python -m spacy download en_core_web_sm if the model was not already installed by the environment.
Important packages used in the project (non-exhaustive):
- PyTorch
- transformers, tokenizers
- datasets (Hugging Face)
- sentence-transformers
- scikit-learn, numpy, pandas
- lime (explainability)
- spaCy (NLP preprocessing)
- wikipedia, wikipedia-api (verification utilities)
See requirements.txt and environment.yml for full dependency lists.
Workflow
- Data preparation
- Use src/data_loader.py and src/data_processor.py to load and preprocess raw data.
- Processed splits are stored under data/processed (train/val/test).
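  A minimal sketch of this step using plain pandas/scikit-learn; the real pipeline lives in src/data_loader.py and src/data_processor.py, and the split file names here are assumptions:

  ```python
  # Illustrative only: the project's own loader/processor may clean,
  # engineer features and split differently.
  import pandas as pd
  from sklearn.model_selection import train_test_split

  raw_df = pd.read_csv("data/raw/halueval_qa.csv")  # one of the raw datasets
  raw_df = raw_df.dropna().drop_duplicates()        # basic cleanup

  # 80/10/10 split, mirroring the train/val/test files under data/processed/.
  train_df, tmp_df = train_test_split(raw_df, test_size=0.2, random_state=42)
  val_df, test_df = train_test_split(tmp_df, test_size=0.5, random_state=42)

  train_df.to_csv("data/processed/train.csv", index=False)  # file names assumed
  val_df.to_csv("data/processed/val.csv", index=False)
  test_df.to_csv("data/processed/test.csv", index=False)
  ```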
- Dataset & dataloaders
- src/dataset.py provides dataset wrappers for training and evaluation.
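  A sketch of what such a wrapper might look like; the class name, the "text"/"label" column names, and the tokenizer choice are assumptions, not the verified API of src/dataset.py:

  ```python
  import pandas as pd
  import torch
  from torch.utils.data import Dataset, DataLoader
  from transformers import AutoTokenizer

  class HallucinationDataset(Dataset):
      """Tokenizes one CSV split for sequence classification."""

      def __init__(self, csv_path, tokenizer, max_length=256):
          self.df = pd.read_csv(csv_path)
          self.tokenizer = tokenizer
          self.max_length = max_length

      def __len__(self):
          return len(self.df)

      def __getitem__(self, idx):
          row = self.df.iloc[idx]
          enc = self.tokenizer(
              row["text"],  # assumed column name
              truncation=True,
              padding="max_length",
              max_length=self.max_length,
              return_tensors="pt",
          )
          return {
              "input_ids": enc["input_ids"].squeeze(0),
              "attention_mask": enc["attention_mask"].squeeze(0),
              "label": torch.tensor(int(row["label"])),  # assumed column name
          }

  tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
  train_ds = HallucinationDataset("data/processed/train.csv", tokenizer)
  train_loader = DataLoader(train_ds, batch_size=16, shuffle=True)
  ```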
- Modeling & training
- src/model.py defines the model architecture.
- src/trainer.py contains the training loop, checkpointing and logging.
- notebooks/02_model_training.ipynb demonstrates the experiment steps; a condensed sketch follows below.
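  The sketch reuses train_loader from the dataset example above; the base model, hyperparameters, and checkpoint policy are assumptions:

  ```python
  import torch
  from transformers import AutoModelForSequenceClassification

  device = "cuda" if torch.cuda.is_available() else "cpu"
  model = AutoModelForSequenceClassification.from_pretrained(
      "bert-base-uncased", num_labels=2  # assumed base model / label count
  ).to(device)
  optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

  for epoch in range(3):
      model.train()
      for batch in train_loader:  # DataLoader from the dataset sketch
          optimizer.zero_grad()
          out = model(
              input_ids=batch["input_ids"].to(device),
              attention_mask=batch["attention_mask"].to(device),
              labels=batch["label"].to(device),
          )
          out.loss.backward()
          optimizer.step()
      # trainer.py presumably keeps only the best validation checkpoint
      # as models/best_model.pt; this sketch saves unconditionally.
      torch.save(model.state_dict(), "models/best_model.pt")
  ```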
- Inference & evaluation
- src/predictor.py performs inference on new inputs.
- src/evaluator.py computes metrics and generates evaluation reports.
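  Continuing the sketches above (model, tokenizer, device), a minimal test-set pass; the metric choices and output format are assumptions, and evaluator.py is the authoritative version:

  ```python
  import json
  import torch
  from torch.utils.data import DataLoader
  from sklearn.metrics import accuracy_score, f1_score

  model.load_state_dict(torch.load("models/best_model.pt", map_location=device))
  model.eval()

  test_loader = DataLoader(
      HallucinationDataset("data/processed/test.csv", tokenizer), batch_size=32
  )
  preds, labels = [], []
  with torch.no_grad():
      for batch in test_loader:
          logits = model(
              input_ids=batch["input_ids"].to(device),
              attention_mask=batch["attention_mask"].to(device),
          ).logits
          preds.extend(logits.argmax(dim=-1).cpu().tolist())
          labels.extend(batch["label"].tolist())

  metrics = {"accuracy": accuracy_score(labels, preds),
             "f1": f1_score(labels, preds)}
  with open("models/metrics.json", "w") as f:
      json.dump(metrics, f, indent=2)
  ```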
- Explainability
- src/explainer.py and notebooks/03_explainability.ipynb explore model explanations with LIME; see the sketch below.
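  A LIME sketch against the model from the training example; the class names and probability wrapper are assumptions, and src/explainer.py defines the project's real interface:

  ```python
  import torch
  from lime.lime_text import LimeTextExplainer

  def predict_proba(texts):
      """Map raw strings to class probabilities, as LIME expects."""
      enc = tokenizer(list(texts), truncation=True, padding=True,
                      max_length=256, return_tensors="pt").to(device)
      with torch.no_grad():
          logits = model(**enc).logits
      return torch.softmax(logits, dim=-1).cpu().numpy()

  explainer = LimeTextExplainer(class_names=["factual", "hallucinated"])
  exp = explainer.explain_instance(
      "The Eiffel Tower was completed in 1999.",  # toy input
      predict_proba,
      num_features=8,
  )
  print(exp.as_list())  # per-token contributions to the prediction
  ```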
- Verification
- src/wikipedia_verifier.py provides utilities to check factual claims against Wikipedia.
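  A rough sketch of how such a check might work, pairing the wikipedia package with sentence-transformers for similarity scoring; the actual logic in wikipedia_verifier.py may differ:

  ```python
  import wikipedia
  from sentence_transformers import SentenceTransformer, util

  encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder choice

  def verify_claim(claim, top_k=3):
      """Score a claim against summaries of the top Wikipedia hits."""
      scores = []
      for title in wikipedia.search(claim, results=top_k):
          try:
              summary = wikipedia.summary(title, sentences=3)
          except wikipedia.exceptions.WikipediaException:
              continue  # skip disambiguation pages and missing articles
          sim = util.cos_sim(encoder.encode(claim), encoder.encode(summary))
          scores.append((title, float(sim)))
      return sorted(scores, key=lambda t: t[1], reverse=True)

  print(verify_claim("The Eiffel Tower is located in Paris."))
  ```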
Outputs
- Trained model artifacts are under models/ (best_model.pt, metrics.json, training_history.json).
- The results/ directory is reserved for evaluation outputs, visualizations and exported reports produced by notebooks or scripts.
- Notebooks under notebooks/ produce reproducible EDA, training runs and explainability outputs — export their figures into results/ as needed.