diff --git a/README.md b/README.md
index 434333f..826cc34 100644
--- a/README.md
+++ b/README.md
@@ -38,6 +38,10 @@ After installation, the console scripts (`rate-extract`, `rate-evaluate`) are in
 python -m rate_eval.cli.evaluate [OPTIONS]
 ```
 
+## Tutorial
+
+For a step-by-step walkthrough of feature extraction and evaluation (including how to add a custom dataset), see [TUTORIAL.md](TUTORIAL.md).
+
 ## Evaluate Pillar0 on Merlin Abdominal CT Dataset
 
 ```bash
diff --git a/TUTORIAL.md b/TUTORIAL.md
new file mode 100644
index 0000000..1fc17e7
--- /dev/null
+++ b/TUTORIAL.md
@@ -0,0 +1,277 @@
+# RATE-Evals Tutorial
+
+A step-by-step walkthrough of feature extraction and evaluation using the
+RATE-Evals pipeline. By the end you will know how to:
+
+1. Extract vision embeddings from a model using the CLI
+2. Use the Python API to extract features directly
+3. Run disease-finding evaluation on those embeddings
+4. Add a custom dataset of your own
+
+## Prerequisites
+
+- Python 3.8+
+- GPU recommended (CPU works but is slower)
+- Install the project:
+
+```bash
+uv sync
+```
+
+> **Note**: You do **not** need the `rad-vision-engine` (`rve`) package for
+> this tutorial. The `DummyDataset` used below is self-contained.
+> If you hit an `rve` import error elsewhere, see
+> [Troubleshooting](#troubleshooting).
+
+## Setup
+
+Generate the synthetic image and labels used throughout the tutorial:
+
+```bash
+python tutorial/setup_tutorial.py
+```
+
+This creates two files:
+
+| File | Purpose |
+|------|---------|
+| `assets/CXR145_IM-0290-1001.png` | 1024x1024 grayscale synthetic chest X-ray (required by `DummyDataset`) |
+| `tutorial/dummy_labels.json` | Labels for 100 dummy studies with 3 binary findings |
+
+---
+
+## Part 1: Extract Features (CLI)
+
+Use `rate-extract` to compute embeddings for the dummy dataset:
+
+```bash
+# Extract training split
+uv run rate-extract \
+    --model pillar0 \
+    --dataset dummy \
+    --split train \
+    --batch-size 4 \
+    --output-dir cache/pillar0_dummy \
+    --max-samples 100
+```
+
+**Flags explained:**
+
+| Flag | Meaning |
+|------|---------|
+| `--model pillar0` | Use the Pillar-0 vision foundation model |
+| `--dataset dummy` | Use the built-in `DummyDataset` (reads the synthetic image) |
+| `--split train` | Process the training split |
+| `--batch-size 4` | Images per GPU batch (lower this if you hit OOM) |
+| `--output-dir cache/pillar0_dummy` | Where to save embedding `.npz` files |
+| `--max-samples 100` | Limit to 100 samples (matches our labels) |
+
+Now extract the test split too:
+
+```bash
+uv run rate-extract \
+    --model pillar0 \
+    --dataset dummy \
+    --split test \
+    --batch-size 4 \
+    --output-dir cache/pillar0_dummy \
+    --max-samples 100
+```
+
+**Verify the output:**
+
+```bash
+ls cache/pillar0_dummy/train/
+ls cache/pillar0_dummy/test/
+# You should see .npz files containing the cached embeddings.
+```
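+
+If you want to peek inside one of the cached archives, you can open it with
+NumPy. Below is a minimal sketch; the array keys stored in each `.npz` depend
+on the extractor, so it simply enumerates whatever is there:
+
+```python
+import glob
+
+import numpy as np
+
+# Grab the first cached archive from the train split
+path = sorted(glob.glob("cache/pillar0_dummy/train/*.npz"))[0]
+
+# List every array in the archive without assuming specific key names
+with np.load(path) as archive:
+    for key in archive.files:
+        print(f"{key}: shape={archive[key].shape}, dtype={archive[key].dtype}")
+```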
+
+---
+
+## Part 2: Extract Features (Python API)
+
+For programmatic access, you can call the model directly:
+
+```python
+from rate_eval import create_model, create_dataset, setup_pipeline
+
+# Load configuration
+config = setup_pipeline()
+
+# Create model and dataset
+model = create_model("pillar0", config)
+dataset = create_dataset("dummy", config, split="train")
+
+print(f"Dataset size : {len(dataset)}")
+print(f"Accession [0]: {dataset.get_accession(0)}")
+
+# Get a single sample (two-view chest X-ray tensor)
+sample = dataset[0]
+print(f"Sample shape : {sample.shape}")
+# -> torch.Size([1, 2, 1024, 1024]) (C, views, H, W)
+
+# Extract features (must pass a modality matching the dataset)
+features = model.extract_features(sample.unsqueeze(0), modality="chest_xray_two_view")
+print(f"Embedding shape: {features.shape}")
+```
+
+This route is useful when you want to integrate RATE-Evals into a larger
+pipeline or inspect intermediate representations: the model maps each image to
+a fixed-size embedding vector.
+
+---
+
+## Part 3: Run Evaluation
+
+Once you have cached embeddings for both the `train` and `test` splits, run
+evaluation:
+
+```bash
+uv run rate-evaluate \
+    --checkpoint-dir cache/pillar0_dummy \
+    --dataset-name dummy \
+    --labels-json tutorial/dummy_labels.json \
+    --output-dir results/pillar0_dummy \
+    evaluation.use_wandb=false
+```
+
+> `evaluation.use_wandb=false` is a Hydra-style override (no `--` prefix).
+> It disables Weights & Biases logging so you can run offline.
+
+**What gets produced:**
+
+| File | Contents |
+|------|----------|
+| `results/pillar0_dummy/detailed_results.csv` | Per-question metrics (AUC, F1, accuracy, etc.) |
+| `results/pillar0_dummy/summary_stats.json` | Aggregated metrics across all findings |
+| `results/pillar0_dummy/training_stats.json` | Class distribution statistics from training |
+| `results/pillar0_dummy/exam_probabilities.csv` | Per-exam predicted probabilities |
+
+**Inspect the results:**
+
+```python
+import json
+
+import pandas as pd
+
+summary = json.load(open("results/pillar0_dummy/summary_stats.json"))
+print(f"Average AUC: {summary['avg_auc']:.3f}")
+
+detailed = pd.read_csv("results/pillar0_dummy/detailed_results.csv")
+print(detailed[["question", "auc", "f1", "num_positive", "num_negative"]])
+```
+
+> **Note**: Because the dummy dataset repeats the same synthetic image, the
+> embeddings carry no discriminative signal and metrics will be near-random.
+> With real data, expect meaningful AUC scores.
+
+---
+
+## Part 4: Add a Custom Dataset
+
+To evaluate on your own data, follow these four steps:
+
+### Step A: Create a Dataset Class
+
+Your class must implement the following interface:
+
+```python
+class MyDataset:
+    def __init__(self, config, split="train", transforms=None, model_preprocess=None):
+        ...
+
+    def __getitem__(self, idx):
+        """Return a tensor of shape (C, views, H, W), or (D, H, W) for 3-D volumes."""
+        ...
+
+    def __len__(self):
+        """Total number of samples."""
+        ...
+
+    def get_accession(self, idx):
+        """Return a unique string identifier (accession) for sample idx."""
+        ...
+
+    def get_all_accessions(self):
+        """Return a list of all accession strings (fast, no data loading)."""
+        ...
+```
+
+Place the file in `rate_eval/datasets/` and import it in
+`rate_eval/datasets/__init__.py`.
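+
+For concreteness, here is a minimal sketch of that interface for a flat folder
+of grayscale PNG files (one image per study, accession taken from the file
+name). The config access, preprocessing, and two-view layout below are
+illustrative assumptions rather than the pipeline's actual internals; adapt
+them to your data:
+
+```python
+import os
+
+import numpy as np
+import torch
+from PIL import Image
+
+
+class MyDataset:
+    """Toy example: one grayscale PNG per study, accession = file stem."""
+
+    def __init__(self, config, split="train", transforms=None, model_preprocess=None):
+        # Assumption: config behaves like a dict mirroring my_dataset.yaml
+        # (split-specific file lists are omitted for brevity)
+        self.root_dir = config["data"]["root_dir"]
+        self.transforms = transforms
+        self.model_preprocess = model_preprocess
+        self.files = sorted(f for f in os.listdir(self.root_dir) if f.endswith(".png"))
+
+    def __getitem__(self, idx):
+        img = Image.open(os.path.join(self.root_dir, self.files[idx])).convert("L")
+        tensor = torch.from_numpy(np.array(img, dtype=np.float32)) / 255.0
+        # Duplicate the single view to fill the (C, views, H, W) layout
+        tensor = tensor.unsqueeze(0).unsqueeze(0).repeat(1, 2, 1, 1)
+        if self.transforms is not None:
+            tensor = self.transforms(tensor)
+        return tensor
+
+    def __len__(self):
+        return len(self.files)
+
+    def get_accession(self, idx):
+        return os.path.splitext(self.files[idx])[0]
+
+    def get_all_accessions(self):
+        return [os.path.splitext(f)[0] for f in self.files]
+```
+
+Duplicating the single view is just a convenient way to satisfy the two-view
+tensor shape for single-view data; with genuine two-view studies you would
+stack the real views instead.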
+
+### Step B: Create a Config YAML
+
+Create `configs/dataset/my_dataset.yaml`:
+
+```yaml
+# @package dataset
+name: my_dataset
+modality: chest_xray_two_view  # or abdomen_ct, chest_ct, brain_ct, etc.
+
+data:
+  root_dir: /path/to/images
+
+processing:
+  target_size: [1024, 1024]
+  image_mode: L  # L = grayscale, RGB = colour
+
+labels:
+  labels_json: /path/to/labels.json
+```
+
+### Step C: Register in Config
+
+Add an entry to `configs/config.yaml` under the `datasets:` section:
+
+```yaml
+datasets:
+  # ... existing entries ...
+  my_dataset:
+    class: MyDataset
+    config: configs/dataset/my_dataset.yaml
+```
+
+### Step D: Prepare Labels JSON
+
+Create a JSON file where each key is an accession string and the value contains
+`qa_results` with a list of question-answer dicts:
+
+```json
+{
+  "ACCESSION_001": {
+    "qa_results": {
+      "findings": [
+        {"Is there evidence of cardiomegaly?": "yes"},
+        {"Is there a pleural effusion?": "no"}
+      ]
+    }
+  }
+}
+```
+
+Answers should be `"yes"` or `"no"` (case-insensitive). The evaluator
+discovers all unique questions automatically.
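+
+Before running evaluation, it is worth sanity-checking the file. Here is a
+small stand-alone check (the path is a placeholder):
+
+```python
+import json
+
+# Placeholder path: point this at your labels file
+with open("/path/to/labels.json") as f:
+    labels = json.load(f)
+
+questions = set()
+for accession, entry in labels.items():
+    for qa in entry["qa_results"]["findings"]:
+        for question, answer in qa.items():
+            assert answer.lower() in {"yes", "no"}, f"{accession}: bad answer {answer!r}"
+            questions.add(question)
+
+print(f"{len(labels)} studies, {len(questions)} unique questions")
+```
+
+If the assertion fires, fix the offending entry before running `rate-evaluate`.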
+
+### Run It
+
+```bash
+uv run rate-extract --model pillar0 --dataset my_dataset --all-splits \
+    --batch-size 4 --output-dir cache/pillar0_my_dataset
+
+uv run rate-evaluate --checkpoint-dir cache/pillar0_my_dataset \
+    --dataset-name my_dataset \
+    --labels-json /path/to/labels.json \
+    --output-dir results/pillar0_my_dataset \
+    evaluation.use_wandb=false
+```
+
+---
+
+## Troubleshooting
+
+| Problem | Solution |
+|---------|----------|
+| **`ModuleNotFoundError: rve`** | The `rve` (rad-vision-engine) package is only needed for RVE-format datasets. Use `DummyDataset` or NIfTI-based datasets instead, or install it: `git clone https://github.com/yalalab/rad-vision-engine ../rad-vision-engine && uv pip install -e ../rad-vision-engine` |
+| **Out of memory (OOM)** | Reduce `--batch-size` (try 1 or 2) or use `--device cpu` |
+| **NaN embeddings** | Add `--check-nan` to `rate-extract` for detailed diagnostics |
+| **"Command not found"** | Add `~/.local/bin` to your `PATH`, or use `uv run rate-extract` / `uv run rate-evaluate` |
+| **HuggingFace auth errors** | Run `huggingface-cli login` for gated models (MedGemma, MedImageInsight) |
diff --git a/tutorial/setup_tutorial.py b/tutorial/setup_tutorial.py
new file mode 100644
index 0000000..feefe04
--- /dev/null
+++ b/tutorial/setup_tutorial.py
@@ -0,0 +1,93 @@
+"""Generate synthetic data for the RATE-Evals tutorial.
+
+Creates:
+    - assets/CXR145_IM-0290-1001.png   Synthetic 1024x1024 grayscale chest X-ray
+    - tutorial/dummy_labels.json       Labels for 100 dummy studies (3 binary findings)
+
+Dependencies: Pillow (already a project dependency) + stdlib.
+
+Usage:
+    python tutorial/setup_tutorial.py
+"""
+
+import json
+import os
+import random
+
+from PIL import Image, ImageDraw, ImageFilter
+
+SEED = 42
+NUM_SAMPLES = 100
+POSITIVE_RATE = 0.20  # ~20% positive rate per finding
+
+QUESTIONS = [
+    "Is there evidence of cardiomegaly?",
+    "Is there a pleural effusion?",
+    "Is there lung consolidation?",
+]
+
+ASSETS_DIR = os.path.join(os.path.dirname(__file__), "..", "assets")
+LABELS_PATH = os.path.join(os.path.dirname(__file__), "dummy_labels.json")
+
+
+def _create_synthetic_cxr(path: str, size: int = 1024) -> None:
+    """Create a synthetic grayscale image that loosely resembles a chest X-ray."""
+    random.seed(SEED)
+    img = Image.new("L", (size, size), color=30)
+    draw = ImageDraw.Draw(img)
+
+    # Dark oval for the thoracic cavity
+    draw.ellipse(
+        [size // 6, size // 8, size - size // 6, size - size // 10],
+        fill=50,
+    )
+
+    # Lighter region in the centre (mediastinum)
+    cx, cy = size // 2, size // 2
+    draw.ellipse(
+        [cx - size // 8, cy - size // 4, cx + size // 8, cy + size // 4],
+        fill=80,
+    )
+
+    # Random speckle to give texture
+    for _ in range(2000):
+        x = random.randint(0, size - 1)
+        y = random.randint(0, size - 1)
+        v = random.randint(20, 90)
+        draw.point((x, y), fill=v)
+
+    img = img.filter(ImageFilter.GaussianBlur(radius=3))
+    os.makedirs(os.path.dirname(path), exist_ok=True)
+    img.save(path)
+    print(f"Created synthetic image: {path}")
+
+
+def _create_dummy_labels(path: str) -> None:
+    """Create labels JSON in qa_results format for dummy_study_0000..0099."""
+    random.seed(SEED)
+    labels = {}
+
+    for idx in range(NUM_SAMPLES):
+        accession = f"dummy_study_{idx:04d}"
+        findings = []
+        for q in QUESTIONS:
+            answer = "yes" if random.random() < POSITIVE_RATE else "no"
+            findings.append({q: answer})
+
+        labels[accession] = {"qa_results": {"findings": findings}}
+
+    os.makedirs(os.path.dirname(path), exist_ok=True)
+    with open(path, "w") as f:
+        json.dump(labels, f, indent=2)
+    print(f"Created labels for {NUM_SAMPLES} studies: {path}")
+
+
+def main() -> None:
+    image_path = os.path.join(ASSETS_DIR, "CXR145_IM-0290-1001.png")
+    _create_synthetic_cxr(image_path)
+    _create_dummy_labels(LABELS_PATH)
+    print("\nSetup complete! You can now follow TUTORIAL.md.")
+
+
+if __name__ == "__main__":
+    main()