Paper2Project

Paper2Project converts a machine learning research paper PDF into a structured, editable, reproducible starter project with PyTorch code and a Google Colab notebook.

It is built for the real paper-to-implementation workflow:

upload paper -> inspect extracted pipeline -> edit decisions -> generate code -> run notebook

UI Preview

The app now includes a simple workflow UI for uploads, progress tracking, agent trace inspection, decision editing, and artifact downloads.

Screenshots (see docs/images): the dashboard, the workflow artifacts view, and the agent trace view.

What It Does

Given a research paper PDF, Paper2Project can:

  1. Parse and clean the paper
  2. Extract a structured ML understanding
  3. Build an executable pipeline plan
  4. Expose a human-editable decision layer
  5. Generate modular project files
  6. Generate a Colab notebook
  7. Package the artifacts for download

Generated artifacts include:

  • model.py
  • data_loader.py
  • train.py
  • config.yaml
  • requirements.txt
  • paper2project_notebook.ipynb

Why This Project Exists

Research papers are rarely implementation-ready. They often leave gaps around preprocessing, hyperparameters, dataset availability, evaluation details, or engineering structure.

Paper2Project is designed to bridge that gap by making every stage explicit:

  • what the parser extracted
  • what the analyst inferred
  • what the planner assumed
  • what the user changed
  • what the generator produced

Core Design Principles

  • Multi-agent instead of one giant LLM call
  • JSON contracts between stages
  • Human-in-the-loop before generation
  • Reproducibility over magic
  • Graceful degradation when inputs are messy
  • Baseline code that runs and can be extended

Workflow

flowchart TD
    A["Upload Paper PDF"] --> B["PDF Parsing + Cleaning"]
    B --> C["Paper Analyst Agent"]
    C --> D["Analysis JSON"]
    D --> E["Planner Agent"]
    E --> F["Pipeline Plan JSON"]
    F --> G["Decision Agent"]
    G --> H["Editable Decision Config"]
    H --> I["User Approval"]
    I --> J["Code Generator Agent"]
    J --> K["Project Files"]
    K --> L["Notebook Builder Agent"]
    L --> M["Colab Notebook + Artifact Bundle"]
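The staged hand-offs in the flowchart can be sketched as plain functions exchanging JSON-style dicts. This is a toy illustration of the "JSON contracts between stages" idea; the stage names, keys, and values below are assumptions, not the repository's actual API.

```python
# Each stage consumes and returns a JSON-serializable dict, so every
# hand-off can be inspected, stored, and edited between stages.
def parse_pdf(job: dict) -> dict:
    job["parsed"] = {"sections": ["Abstract", "Method"], "equations": []}
    return job

def analyze(job: dict) -> dict:
    job["analysis"] = {"task": "classification", "model_family": "CNN"}
    return job

def plan(job: dict) -> dict:
    job["plan"] = {
        "steps": ["load data", "build model", "train"],
        "assumptions": ["batch size not stated; default 32"],
    }
    return job

STAGES = [parse_pdf, analyze, plan]

def run_pipeline(pdf_path: str) -> dict:
    job = {"pdf": pdf_path}
    for stage in STAGES:
        job = stage(job)  # every intermediate state stays a plain dict
    return job
```

Because every intermediate state is a plain dict, it can be persisted to the job store and surfaced in the UI trace unchanged.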

UI Workflow

The built-in UI lets a user:

  • upload a PDF directly from the browser
  • see recent jobs and live status changes
  • inspect parsed summaries, analysis, and pipeline plans
  • review agent prompts, payloads, and responses
  • edit training and model decisions
  • approve generation
  • download the final artifact bundle

The UI is served directly by FastAPI at /.

Agents

Paper Analyst Agent

Extracts:

  • task
  • domain
  • input/output format
  • model family
  • components
  • loss
  • metrics
  • training details

Planner Agent

Converts analysis into:

  • execution steps
  • dataset requirements
  • model structure
  • hyperparameters
  • assumptions
  • open questions

Decision Agent

Exposes user-editable controls such as:

  • dataset
  • model
  • optimizer
  • scheduler
  • loss
  • epochs
  • batch size
  • learning rate
  • seed
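A decision config built from the controls above might look like the following sketch, together with a PATCH-style merge that rejects keys outside the editable surface. Field names and defaults here are assumptions, not the actual schema.

```python
# Illustrative editable decision config (field names/defaults assumed).
DEFAULT_DECISION = {
    "dataset": "cifar10",
    "model": "resnet18",
    "optimizer": "adamw",
    "scheduler": "cosine",
    "loss": "cross_entropy",
    "epochs": 10,
    "batch_size": 32,
    "learning_rate": 3e-4,
    "seed": 42,
}

def apply_user_edits(decision: dict, edits: dict) -> dict:
    """Merge a partial edit (PATCH semantics) into the decision config,
    rejecting keys that are not part of the editable surface."""
    unknown = set(edits) - set(decision)
    if unknown:
        raise ValueError(f"not editable: {sorted(unknown)}")
    return {**decision, **edits}
```

Returning a new dict rather than mutating in place keeps the pre-edit config available for the agent trace.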

Code Generator Agent

Builds a runnable baseline project from the approved config.

Notebook Builder Agent

Builds a Colab notebook that reconstructs the generated files and runs training.

Current Capabilities

Parsing and enrichment

  • PDF parsing with PyMuPDF
  • Section chunking for downstream LLM calls
  • Heuristic equation extraction
  • Optional Grobid TEI ingestion
  • Optional arXiv source download and LaTeX text enrichment
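Section chunking and heuristic equation spotting can be sketched on already-extracted text as below. The real parser uses PyMuPDF for extraction; the heading and equation heuristics here are illustrative assumptions.

```python
import re

HEADING = re.compile(r"^\d+(\.\d+)*\s+\S", re.MULTILINE)  # e.g. "2 Method", "3.1 Loss"
EQUATION_HINT = re.compile(r"=|\\(frac|sum|int)")          # crude equation signal

def chunk_sections(text: str) -> list[str]:
    """Split extracted text at numbered headings so each chunk can feed
    its own downstream LLM call."""
    starts = [m.start() for m in HEADING.finditer(text)]
    if not starts:
        return [text]
    if starts[0] != 0:
        starts = [0] + starts
    return [text[a:b] for a, b in zip(starts, starts[1:] + [len(text)])]

def looks_like_equation(line: str) -> bool:
    """Heuristic hint that a line carries an inline or LaTeX equation."""
    return bool(EQUATION_HINT.search(line))
```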

LLM execution

  • Multi-provider LLM client
  • Shared agent memory across stages
  • Configurable provider roster
  • Configurable provider strategy:
    • fallback_chain
    • first_success
    • ensemble
  • Retry and backoff support
  • Heuristic fallback when LLM output is unavailable or invalid
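The fallback_chain strategy with retry, backoff, and heuristic fallback can be sketched as follows. The function name and signature are illustrative, not the repository's actual client API.

```python
import time

def call_with_fallback(roster, prompt, retries=2, backoff=0.5, heuristic=None):
    """Try each (name, callable) provider in roster order; each callable
    takes a prompt and returns text or raises on failure."""
    for name, provider in roster:
        for attempt in range(retries):
            try:
                return name, provider(prompt)
            except Exception:
                time.sleep(backoff * (2 ** attempt))  # exponential backoff
    if heuristic is not None:
        return "heuristic", heuristic(prompt)  # graceful degradation
    raise RuntimeError("all providers failed and no heuristic fallback")
```

first_success and ensemble would differ only in how results from the roster are collected and combined.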

Workflow and backend

  • FastAPI backend
  • Background job execution with ThreadPoolExecutor
  • Persistent JSON job store
  • Artifact metadata and zip download endpoint
  • API key protection
  • CORS support

Generation

  • Config-driven training entrypoint
  • Multi-domain baseline generation for:
    • NLP classification
    • NLP generation
    • CV classification
    • CV segmentation
    • tabular classification/regression
    • RL with a DQN baseline
  • TensorBoard and optional W&B hooks
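Multi-domain generation implies a dispatch from (domain, task) to a baseline template, which might look like the sketch below. The template paths and the generic fallback are assumptions, not the repository's actual module layout.

```python
# Hypothetical dispatch mirroring the domain/task list above.
BASELINES = {
    ("nlp", "classification"): "templates/nlp_classifier.py",
    ("nlp", "generation"): "templates/nlp_generator.py",
    ("cv", "classification"): "templates/cv_classifier.py",
    ("cv", "segmentation"): "templates/cv_segmenter.py",
    ("tabular", "classification"): "templates/tabular.py",
    ("tabular", "regression"): "templates/tabular.py",
    ("rl", "control"): "templates/dqn.py",
}

def pick_baseline(domain: str, task: str) -> str:
    try:
        return BASELINES[(domain.lower(), task.lower())]
    except KeyError:
        # graceful degradation: unknown domains get a generic baseline
        return "templates/tabular.py"
```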

Supported LLM Providers

The current provider layer supports:

  • OpenAI
  • Anthropic
  • Google Gemini
  • OpenRouter
  • DeepSeek
  • Groq
  • Together
  • xAI
  • Ollama

Configuration is environment-driven with P2P_-prefixed settings, and .env loading is supported.

Project Structure

pro4/
|-- app/
|   |-- agents/
|   |-- api/
|   |-- core/
|   |-- models/
|   |-- orchestration/
|   |-- prompts/
|   |-- services/
|   `-- web/
|-- docs/
|   `-- images/
|-- examples/
|-- tests/
|-- pyproject.toml
`-- README.md

Key Files

  • app/main.py
  • app/orchestration/workflow.py
  • app/models/schemas.py
  • app/services/llm_client.py
  • app/services/pdf_parser.py
  • app/services/code_generator.py
  • app/services/notebook_builder.py
  • app/web/index.html
  • tests/

API Overview

POST /jobs

Upload a PDF and create a background job.

GET /jobs

List recent jobs for the UI.

GET /jobs/{job_id}

Fetch job state and outputs.

GET /jobs/{job_id}/trace

Fetch the agent trace shown in the UI.

GET /jobs/{job_id}/decision

Fetch the editable decision config.

PATCH /jobs/{job_id}/decision

Update the decision config before generation.

POST /jobs/{job_id}/approve

Start project and notebook generation.

GET /jobs/{job_id}/artifacts

Fetch the artifact manifest.

GET /jobs/{job_id}/artifacts/download

Download the generated zip bundle.

Quick Start

python -m venv .venv
.venv\Scripts\activate    # Windows (macOS/Linux: source .venv/bin/activate)
pip install -e .
uvicorn app.main:app --reload

Open:

http://127.0.0.1:8000

Health check:

curl http://127.0.0.1:8000/health

Environment Configuration

Examples:

P2P_OPENAI_API_KEY=...
P2P_ANTHROPIC_API_KEY=...
P2P_GOOGLE_API_KEY=...
P2P_LLM_ROSTER=openai:gpt-4.1-mini,anthropic:claude-3-5-sonnet-latest,google:gemini-2.5-flash
P2P_LLM_STRATEGY=fallback_chain
P2P_REQUIRE_API_KEY=false

Reproducibility Features

  • fixed seed in generated configs
  • explicit assumptions
  • config-driven training
  • modular output files
  • editable decision layer
  • artifact packaging
  • notebook reconstruction from generated files
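The fixed-seed idea can be demonstrated with the standard library alone; generated projects would also seed numpy and torch, which is assumed in the comment below.

```python
import random

def set_seed(seed: int) -> None:
    random.seed(seed)
    # In generated code, also (assumed):
    #   numpy.random.seed(seed)
    #   torch.manual_seed(seed)

set_seed(42)
run_a = [random.random() for _ in range(3)]
set_seed(42)
run_b = [random.random() for _ in range(3)]
assert run_a == run_b  # same seed, same draws
```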

Current Status

Paper2Project has moved beyond a pure scaffold. The repository includes:

  • provider-backed LLM orchestration
  • persistence for job state
  • background execution
  • config-driven code generation
  • notebook generation
  • download endpoints
  • authentication and CORS
  • a workflow UI
  • tests for core utilities

It is still a baseline-oriented system, not a perfect paper reproduction engine. That tradeoff is intentional.

Tests

The repository includes a starter test suite in tests/ covering:

  • job store persistence
  • PDF parsing basics
  • nested heading handling
  • dataset mapper behavior
  • generated config structure
  • memory filtering and truncation

Contributors

Paper2Project is authored and directed by Antony Joseph.

AI-assisted contribution and development support:

  • OpenAI Codex
  • Claude

Documentation

  • Architecture
  • Implementation Plan
  • Example Parsed Paper
  • Example Analysis
  • Example Pipeline Plan
  • Example Decision Config

Vision

The target workflow is straightforward:

Upload a paper -> inspect the extracted pipeline -> edit decisions -> generate code -> run the notebook.

That is the workflow this repository is built around.
