A flexible and extensible graph-based retrieval system built with Python's Abstract Base Classes, providing a clean, interface-driven architecture similar to Go's interfaces.
- Python 3.13+
- Virtuoso server and Freebase setup according to GrailQA.
# Clone the repository
git clone <repository-url>
cd graph-retriever
# Install dependencies
pip install -r requirements.txt
Run a simple experiment with default settings:
python run.py
This will run a cosine similarity retriever on a simple graph with synthetic data.
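For readers unfamiliar with cosine similarity retrieval, the following is a minimal sketch of the idea: score every node embedding against a query vector and keep the k best. It is not the project's implementation; the function names `cosine_similarity` and `top_k` and the toy embeddings are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: List[float], nodes: Dict[str, List[float]], k: int) -> List[Tuple[str, float]]:
    """Return the k (node_id, score) pairs most similar to the query, best first."""
    scored = [(node_id, cosine_similarity(query, vec)) for node_id, vec in nodes.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical synthetic embeddings, for illustration only.
synthetic_nodes = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
}
results = top_k([1.0, 0.0], synthetic_nodes, k=2)
```

Here `results` ranks node `a` first (it points in the same direction as the query) and node `c` second.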
python run.py --graph <graph_type> --retriever <retriever_type> --dataset <dataset_type> --k <num_results>
Available Options:
- --graph: Graph implementation (freebase)
- --retriever: Retrieval model (gemini_baseline_retriever)
- --dataset: Dataset to use (synthetic)
- --experiment: Experiment type (standard)
# Use simple graph with cosine retriever
python run.py --graph simple --retriever cosine --k 5
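As a rough illustration of how these flags could be wired up, here is a sketch using `argparse`. The flag names and choice values are the ones mentioned in this README; the helper name `build_parser` and the default for `--k` are assumptions, and the actual parser in run.py may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the options described in this README (sketch only)."""
    parser = argparse.ArgumentParser(description="Run a graph retrieval experiment")
    parser.add_argument("--graph", choices=["simple", "freebase"], default="simple")
    parser.add_argument(
        "--retriever",
        choices=["cosine", "gemini_baseline_retriever"],
        default="cosine",
    )
    parser.add_argument("--dataset", choices=["synthetic"], default="synthetic")
    parser.add_argument("--experiment", choices=["standard"], default="standard")
    # Assumed default; the README does not state the real one.
    parser.add_argument("--k", type=int, default=5, help="number of results to retrieve")
    return parser


# Parse the example command line shown above.
args = build_parser().parse_args(["--graph", "simple", "--retriever", "cosine", "--k", "5"])
```

Restricting each option with `choices` means an unknown graph or retriever name is rejected by the parser before any experiment code runs.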
We welcome contributions! This project is designed to be easily extensible through its interface-based architecture.
graph-retriever/
├── run.py # Main entry point
├── test_grailqa.py # Test script for GrailQA
├── sample_dataset.json # Sample dataset file
├── requirements.txt # Python dependencies
├── README.md # Project documentation
├── .gitignore # Git ignore rules
├── .env.example # Environment variables example
├── .env # Environment variables (local)
├── data/ # Data directory
├── dataset/ # Dataset module
│ ├── __init__.py # Dataset abstract base class
│ ├── create_dataset.py # Dataset factory function
│ └── custom_dataset.py # Custom dataset implementation
├── db/ # Database directory
├── experiment/ # Experiment module
│ ├── __init__.py # Experiment abstract base class
│ ├── create_experiment.py # Experiment factory function
│ └── custom_experiment.py # Custom experiment implementation
├── graph/ # Graph module
│ ├── __init__.py # Graph abstract base class
│ ├── create_graph.py # Graph factory function
│ └── custom_graph.py # Custom graph implementation
├── ontology/ # Ontology directory
├── output/ # Output directory
├── retriever/ # Retriever module
│ ├── __init__.py # Retriever abstract base class
│ ├── create_retriever.py # Retriever factory function
│ └── custom_retriever.py # Custom retriever implementation
└── utils/ # Utils directory
To add a new Dataset, Experiment, Graph, or Retriever, create a new file in the relevant directory. For example, to add a new experiment called MyCustomExperiment, create the file experiment/my_custom_experiment.py, import the base class Experiment, and implement its abstract methods as shown below.
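from typing import Dict

from dataset import Dataset
from experiment import Experiment
from graph import Graph
from retriever import Retriever
# Result must also be imported from wherever the project defines it.


class MyCustomExperiment(Experiment):
    """Custom experiment implementation"""

    # Concrete overrides carry no @abstractmethod; otherwise the class cannot be instantiated.
    def run(self, graph: Graph, retriever: Retriever, dataset: Dataset) -> Dict[str, float]:
        """Evaluate predictions against ground truth"""
        raise NotImplementedError  # replace with your evaluation logic

    def dump_results(self, results: Result, output_path: str) -> None:
        """Dump experiment results to a file"""
        raise NotImplementedError  # replace with your serialization logic

    def __str__(self) -> str:
        return "MyCustomExperiment"

After you have implemented the new class, register it in experiment/create_experiment.py. Finally, add the name of your experiment to the parser choices in run.py.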
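The registration step could look roughly like the sketch below. The contents of experiment/create_experiment.py are not shown in this README, so the registry dictionary `_EXPERIMENTS`, the key `my_custom`, the class `StandardExperiment`, and the simplified stand-in base class are all assumptions used to keep the example self-contained.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict


class Experiment(ABC):
    """Simplified stand-in for the base class in experiment/__init__.py."""

    @abstractmethod
    def __str__(self) -> str: ...


class StandardExperiment(Experiment):
    def __str__(self) -> str:
        return "StandardExperiment"


class MyCustomExperiment(Experiment):
    def __str__(self) -> str:
        return "MyCustomExperiment"


# Hypothetical registry: maps the command-line name to a constructor.
_EXPERIMENTS: Dict[str, Callable[[], Experiment]] = {
    "standard": StandardExperiment,
    "my_custom": MyCustomExperiment,  # the newly registered experiment
}


def create_experiment(name: str) -> Experiment:
    """Instantiate the experiment registered under `name`."""
    try:
        return _EXPERIMENTS[name]()
    except KeyError:
        known = ", ".join(sorted(_EXPERIMENTS))
        raise ValueError(f"Unknown experiment {name!r}; expected one of: {known}") from None


experiment = create_experiment("my_custom")
```

With a factory of this shape, the same key (`my_custom` here) is what would be added to the `--experiment` choices in run.py.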
- Follow the Interface: Always implement all abstract methods defined in the base classes
- Type Hints: Use type hints for all method signatures
- Documentation: Add docstrings explaining your implementation
- Testing: Test your implementation with the synthetic dataset before submitting
- Code Style: Follow PEP 8 style guidelines
- Error Handling: Add appropriate error handling for edge cases
- Commit conventions: Use commit conventions described here.
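The first guideline is enforced by Python itself: a subclass that leaves an abstract method unimplemented cannot be instantiated. The sketch below demonstrates this with a simplified stand-in for the retriever base class; the method name `retrieve` and both subclasses are hypothetical and not taken from the project.

```python
from abc import ABC, abstractmethod
from typing import List


class Retriever(ABC):
    """Simplified stand-in; the real base class lives in retriever/__init__.py."""

    @abstractmethod
    def retrieve(self, query: str, k: int) -> List[str]:
        """Return up to k results for the query."""


class IncompleteRetriever(Retriever):
    """Forgets to implement retrieve()."""


class EchoRetriever(Retriever):
    """Trivial but complete implementation."""

    def retrieve(self, query: str, k: int) -> List[str]:
        return [query] * k


# Instantiating the incomplete subclass fails with TypeError.
try:
    IncompleteRetriever()
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

# The complete subclass works as expected.
results = EchoRetriever().retrieve("q", 2)
```

Running a quick instantiation check like this on a new class is a cheap way to confirm that no abstract method was missed.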
- Fork the repository
- Create a feature branch (git checkout -b feature/my-new-feature)
- Implement your changes following the guidelines above
- Test your implementation thoroughly
- Commit your changes (git commit -am 'Add new feature')
- Push to the branch (git push origin feature/my-new-feature)
- Create a Pull Request with a clear description of your changes