Merged
3 changes: 3 additions & 0 deletions .gitattributes
@@ -0,0 +1,3 @@
data/** filter=lfs diff=lfs merge=lfs -text
data/**/README.md !filter !diff !merge text
data/**/*.py !filter !diff !merge text
2 changes: 2 additions & 0 deletions .github/workflows/test.yml
@@ -15,6 +15,8 @@ jobs:

    steps:
    - uses: actions/checkout@v4
      with:
        lfs: true

    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v5
3 changes: 2 additions & 1 deletion .gitignore
@@ -211,4 +211,5 @@ shine/_version.py
results/
examples/output/

project_plan/
project_plan/
dev
111 changes: 82 additions & 29 deletions CLAUDE.md
@@ -4,7 +4,7 @@

SHINE (SHear INference Environment) is a JAX-powered framework for probabilistic shear estimation in weak gravitational lensing. It treats shear measurement as a Bayesian inverse problem: generating forward models of the sky, convolving with instrument response, and comparing to observed data to infer posterior distributions of shear parameters.

**Status:** Early development / Alpha. The architectural design is complete (see `DESIGN.md`) but source code implementation has not yet begun — currently only `shine/__init__.py` exists as an empty module.
**Status:** Early development / Alpha. Core modules (`shine.config`, `shine.inference`) and the first instrument backend (`shine.euclid`) are implemented. See `DESIGN.md` for the full architecture.

**Organization:** CosmoStat Lab (CEA / CNRS)
**License:** MIT
@@ -14,16 +14,32 @@ SHINE (SHear INference Environment) is a JAX-powered framework for probabilistic
```
SHINE/
├── .github/workflows/  # CI/CD (Claude PR assistant + code review)
│   ├── claude.yml  # Claude PR assistant triggered by @claude mentions
│   └── claude-code-review.yml  # Automated code review on PRs
├── assets/
│   └── logo.png  # Project logo
├── configs/
│   └── euclid_vis.yaml  # Example Euclid VIS config
├── data/
│   └── EUC_VIS_SWL/  # Euclid VIS test data (Git LFS)
├── external/
│   └── GalSim/  # External GalSim dependency (placeholder)
├── notebooks/
│   └── euclid_vis_map.ipynb  # MAP fitting demo notebook
├── scripts/
│   └── test_map.py  # Standalone MAP test script
├── shine/  # Main Python package
│   └── __init__.py  # Currently empty
│   ├── __init__.py
│   ├── config.py  # Base inference configuration
│   ├── prior_utils.py  # Shared prior-parsing (config → NumPyro sample sites)
│   ├── inference.py  # Inference engine (MAP, NUTS)
│   └── euclid/  # Euclid VIS instrument backend
│       ├── config.py  # Euclid-specific configuration
│       ├── data_loader.py  # FITS data loading & source selection
│       ├── scene.py  # Multi-exposure scene model (NumPyro)
│       └── plots.py  # Diagnostic visualizations
├── tests/
│   └── test_euclid/  # Euclid module tests (15 tests)
├── CLAUDE.md  # This file
├── DESIGN.md  # Comprehensive architecture & design document
├── DESIGN.md  # Architecture & design document
├── LICENSE  # MIT License
├── README.md  # Project overview and quick start
└── pyproject.toml  # Build configuration
@@ -36,18 +52,27 @@ SHINE/
- **JAX-GalSim** — Differentiable galaxy profile rendering and PSF convolution
- **BlackJAX** — Optional lower-level inference library for custom samplers

## Planned Module Architecture
## Module Architecture

These modules are specified in `DESIGN.md` but not yet implemented:
| Module | Status | Purpose |
|--------|--------|---------|
| `shine.config` | Implemented | Configuration schema (galaxy model, inference, distributions with `center: catalog`) |
| `shine.prior_utils` | Implemented | Shared prior-parsing: converts `DistributionConfig` → NumPyro sample sites |
| `shine.inference` | Implemented | Inference engine (MAP optimization, NUTS via NumPyro) |
| `shine.euclid` | Implemented | Euclid VIS instrument backend: data loading, scene model, diagnostics |
| `shine.scene_modelling` | Planned | Generic NumPyro generative model definitions |
| `shine.simulations` | Planned | Additional survey interfaces (LSST, MeerKAT) |
| `shine.morphology` | Planned | Non-parametric galaxy profiles (VAE/Diffusion) |
| `shine.wms` | Planned | Workflow management for HPC/SLURM clusters |

| Module | Purpose |
|--------|---------|
| `shine.config` | YAML config parsing and validation (pydantic) |
| `shine.scene_modelling` | NumPyro generative model definitions |
| `shine.inference` | Bayesian inference (NUTS, SVI, BlackJAX) |
| `shine.simulations` | Survey-specific data interfaces (Euclid, LSST, MeerKAT) |
| `shine.morphology` | Galaxy surface brightness profiles (Sersic, VAE/GAN) |
| `shine.wms` | Workflow management for HPC/SLURM clusters |
### `shine.euclid` — Euclid VIS Backend

The first instrument backend, providing end-to-end shear inference on Euclid VIS quadrant-level data:

- **`config.py`** — Pydantic configuration: data paths, source selection (SNR, `det_quality_flag`, size filtering), galaxy model specification via shared `GalaxyConfig` (supports `center: catalog` priors), multi-tier stamp sizes
- **`data_loader.py`** — Reads quadrant FITS files (SCI/RMS/FLG), PSF grids with bilinear interpolation, background maps, MER catalogs; computes per-source WCS positions, Jacobians, PSF stamps, and visibility
- **`scene.py`** — NumPyro generative model: renders Sersic galaxies convolved with spatially-varying PSFs via JAX-GalSim; multi-tier stamp sizes (64/128/256 px) with separate `vmap` per tier; standalone `render_model_images()` for post-inference visualization
- **`plots.py`** — 3-panel diagnostic figures (observed | model | chi residual) with configurable masking
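
The source-selection step listed for `config.py` can be sketched as follows. This is only an illustration, assuming a source is rejected when its SNR is below the threshold or when any bit of an exclusion mask is set in its `det_quality_flag`. The function name `select_sources`, the dictionary keys, and the sample catalog are hypothetical and are not taken from `shine.euclid`; the default values mirror `configs/euclid_vis.yaml`.

```python
def select_sources(catalog, min_snr=10.0, exclude_mask=0x78C, max_sources=None):
    """Keep sources that pass the SNR cut and have no excluded flag bits set.

    Hypothetical helper: the real selection logic lives in shine.euclid and
    may differ (for example it also filters on size and visibility).
    """
    kept = [
        src for src in catalog
        if src["snr"] >= min_snr
        and (src["det_quality_flag"] & exclude_mask) == 0
    ]
    # Brightest first, so that max_sources keeps the brightest sources.
    kept.sort(key=lambda src: src["flux"], reverse=True)
    return kept if max_sources is None else kept[:max_sources]


# Made-up catalog rows, for illustration only.
catalog = [
    {"id": 1, "snr": 25.0, "flux": 120.0, "det_quality_flag": 0x000},
    {"id": 2, "snr": 8.0, "flux": 300.0, "det_quality_flag": 0x000},   # fails SNR cut
    {"id": 3, "snr": 40.0, "flux": 500.0, "det_quality_flag": 0x004},  # bit 2 is in the mask
    {"id": 4, "snr": 15.0, "flux": 200.0, "det_quality_flag": 0x001},  # bit 0 is not in the mask
]

selected = select_sources(catalog, max_sources=2)
selected_ids = [src["id"] for src in selected]
print(selected_ids)
```

Testing the flag with a bitwise AND means a single configurable mask covers any combination of quality bits without changing the code.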

## Build System

@@ -75,21 +100,44 @@ When implementing code for this project, follow these conventions:
- JAX uses a **functional PRNG** — manage RNG keys carefully, especially within NumPyro models
- Support reparameterization (e.g., `LocScaleReparam`) for hierarchical models

## Testing Strategy
## Testing

The `tests/test_euclid/` directory contains 15 tests covering:

Not yet implemented. When tests are added, follow this strategy:
- **Unit tests:** PSF grid interpolation, exposure reading, WCS transforms, source selection (SNR, flag, size filters), ExposureSet assembly
- **Scene tests:** Single-exposure rendering, multi-exposure model, multi-tier stamp rendering, `render_model_images()`
- **Integration test:** End-to-end MAP inference on real Euclid VIS data

- **Unit tests:** Verify individual components (e.g., Sersic profile generation)
- **Integration tests:** End-to-end runs on small synthetic patches
- **Validation tests:**
  - Self-consistency: generate data with known shear, infer it back, verify posterior credible intervals
  - Comparison: compare with standard (non-JAX) GalSim for numerical accuracy
- **Run tests with:** `pytest` (when test infrastructure exists)
Run tests with:

```bash
pytest tests/test_euclid/ -q
```

Future testing should also include:
- Self-consistency: generate data with known shear, infer it back, verify posterior credible intervals
- Comparison: compare with standard (non-JAX) GalSim for numerical accuracy
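
The self-consistency idea can be illustrated with a toy model. The sketch below does not use SHINE at all: it assumes each observed ellipticity is the true shear plus Gaussian shape noise, and it uses the closed-form Normal posterior instead of MAP or NUTS. All function names and numbers are hypothetical.

```python
import math
import random


def simulate_ellipticities(g_true, n_sources, sigma_e, seed):
    """Draw toy observed ellipticities: true shear plus Gaussian shape noise."""
    rng = random.Random(seed)
    return [g_true + rng.gauss(0.0, sigma_e) for _ in range(n_sources)]


def gaussian_posterior(e_obs, sigma_e, prior_sigma):
    """Posterior mean and std of the shear for a zero-mean Normal prior."""
    prior_precision = 1.0 / prior_sigma**2
    data_precision = len(e_obs) / sigma_e**2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * sum(e_obs) / sigma_e**2
    return post_mean, math.sqrt(post_var)


g_true = 0.02
e_obs = simulate_ellipticities(g_true, n_sources=5000, sigma_e=0.3, seed=42)
post_mean, post_std = gaussian_posterior(e_obs, sigma_e=0.3, prior_sigma=0.05)

# 95% credible interval of the Gaussian posterior.
lower, upper = post_mean - 1.96 * post_std, post_mean + 1.96 * post_std
covered = lower <= g_true <= upper
print(f"g1 = {post_mean:.4f} +/- {post_std:.4f}, truth in 95% interval: {covered}")
```

A single realization falls outside its 95% interval about one time in twenty, so a real validation test would either repeat the experiment over many seeds and check the coverage fraction, or assert against a wider interval.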

## Configuration Pattern

SHINE uses GalSim-compatible YAML configuration with a probabilistic extension: any parameter defined as a distribution (e.g., `type: Normal`) becomes a **latent variable** for inference rather than a fixed simulation value. See `DESIGN.md` Section 6.1 for config examples.

Both the generic `SceneBuilder` and the Euclid `MultiExposureScene` read their probabilistic model from the same `GalaxyConfig` schema in the YAML `gal:` section. The shared `parse_prior()` function in `shine.prior_utils` converts each config entry into a NumPyro sample site. For catalog-centered priors (where the location parameter comes from per-source catalog data), use `center: catalog`:

```yaml
gal:
  type: Exponential
  flux: {type: LogNormal, center: catalog, sigma: 0.5} # median from catalog
  half_light_radius: {type: LogNormal, center: catalog, sigma: 0.3}
  shear:
    g1: {type: Normal, mean: 0.0, sigma: 0.05}
    g2: {type: Normal, mean: 0.0, sigma: 0.05}
  position:
    type: Offset
    dx: {type: Normal, mean: 0.0, sigma: 0.05}
    dy: {type: Normal, mean: 0.0, sigma: 0.05}
```
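
To make the `center: catalog` convention concrete, here is a minimal sketch of how one config entry could be resolved into distribution parameters. It is not the actual `parse_prior()` implementation: `resolve_prior` and its return format are hypothetical, and the only assumption carried over from the config is that a catalog-centered LogNormal takes the catalog value as its median, so the location of the underlying Normal is the log of that value.

```python
import math


def resolve_prior(spec, catalog_value=None):
    """Map one config entry to (distribution name, parameters).

    Hypothetical stand-in for a prior parser; a real one would create a
    NumPyro sample site instead of returning a plain tuple.
    """
    kind = spec["type"]
    if spec.get("center") == "catalog":
        if catalog_value is None:
            raise ValueError("center: catalog needs a per-source catalog value")
        if kind != "LogNormal":
            raise ValueError(f"catalog centering not sketched for {kind}")
        # Median of a LogNormal is exp(loc), so loc = log(median).
        return kind, {"loc": math.log(catalog_value), "scale": spec["sigma"]}
    if kind == "Normal":
        return kind, {"loc": spec["mean"], "scale": spec["sigma"]}
    raise ValueError(f"unsupported distribution type: {kind}")


flux_prior = resolve_prior(
    {"type": "LogNormal", "center": "catalog", "sigma": 0.5},
    catalog_value=150.0,  # made-up catalog flux
)
g1_prior = resolve_prior({"type": "Normal", "mean": 0.0, "sigma": 0.05})
print(flux_prior)
print(g1_prior)
```

Because the catalog value only enters at resolution time, the same YAML entry yields a different prior location for every source.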

## Development Roadmap

1. **Phase 1:** Prototype with simple parametric models (Sersic) and constant PSF
@@ -111,19 +159,24 @@ The primary design document is `DESIGN.md` (343 lines). Consult it for:
# Install in development mode
pip install -e .

# Run tests (when available)
# Run all tests
pytest

# Format code (when tooling is configured)
# Run Euclid tests only
pytest tests/test_euclid/ -q

# Standalone MAP test on bundled data
python scripts/test_map.py

# Format code
black shine/
isort shine/
```

## Notes for AI Assistants

- Always consult `DESIGN.md` before implementing new modules — it contains detailed API specifications and code structure examples
- The project has no runtime dependencies listed in `pyproject.toml` yet; add JAX, NumPyro, JAX-GalSim, etc. when implementing modules
- No linter/formatter configuration files exist yet (no `.flake8`, `pyproject.toml [tool.black]`, etc.) — create them following the standards in DESIGN.md Section 4.1 when needed
- No test directory or test infrastructure exists yet — create `tests/` following pytest conventions when adding tests
- The `shine.euclid` package is the reference instrument backend — follow its patterns (config, data loader, scene model) when adding new instruments
- Test data for Euclid is stored in `data/EUC_VIS_SWL/` via Git LFS; FITS files are LFS-tracked, README.md and .py scripts are regular git
- The `external/GalSim/` directory is currently an empty placeholder
- CI/CD currently only includes Claude-based PR workflows; traditional CI (tests, linting) should be added when code exists to test
- CI/CD currently only includes Claude-based PR workflows; traditional CI (tests, linting) should be added
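
Because the FITS test data is LFS-tracked, a checkout without LFS leaves small text pointer files in place of the real data. A pointer file starts with the line `version https://git-lfs.github.com/spec/v1`, so a guard like the sketch below can detect the problem before a test tries to open a file as FITS. The helper name `is_lfs_pointer` and the demo file names are hypothetical; nothing like this is confirmed to exist in the repository.

```python
import tempfile
from pathlib import Path

# First bytes of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file is an unfetched Git LFS pointer."""
    with open(path, "rb") as fh:
        head = fh.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


# Demo on two temporary files with made-up names and contents.
with tempfile.TemporaryDirectory() as tmp:
    pointer = Path(tmp) / "pointer.fits.gz"
    pointer.write_bytes(LFS_POINTER_PREFIX + b"\noid sha256:0000\nsize 123\n")

    fetched = Path(tmp) / "fetched.fits.gz"
    fetched.write_bytes(b"\x1f\x8b" + b"\x00" * 16)  # gzip magic bytes, then padding

    pointer_detected = is_lfs_pointer(pointer)
    fetched_detected = is_lfs_pointer(fetched)

print(pointer_detected, fetched_detected)
```

A test suite could call such a check in a fixture and skip the data-dependent tests with a message suggesting `git lfs pull`.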
14 changes: 13 additions & 1 deletion README.md
@@ -42,7 +42,19 @@ cd SHINE
pip install -e ".[dev,test]"
```

## 🚀 Quick Start
## Euclid VIS Support

SHINE includes a full instrument backend for **Euclid VIS** multi-exposure shear inference (`shine.euclid`):

- Reads quadrant-level FITS exposures, PSF grids, background maps, and MER source catalogs
- Forward-models the scene as Sersic galaxies convolved with spatially-varying PSFs via JAX-GalSim
- Multi-tier stamp rendering (64/128/256 px) assigned per source by half-light radius
- Catalog filtering by SNR, `det_quality_flag`, size, and footprint visibility
- MAP and NUTS inference for shared shear (g1, g2) and per-source morphology

See `notebooks/euclid_vis_map.ipynb` for an interactive demo.
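
As a rough illustration of the multi-tier stamp idea, the sketch below assigns each source the smallest stamp that covers a fixed multiple of its half-light radius. The rule, the coverage factor `n_hlr`, and the function name are assumptions made for this example; the actual assignment in `shine.euclid` may use different criteria.

```python
from collections import defaultdict


def assign_stamp_tier(hlr_arcsec, pixel_scale=0.1, tiers=(64, 128, 256), n_hlr=10.0):
    """Pick the smallest stamp size (pixels) spanning n_hlr radii on each side.

    n_hlr is a hypothetical coverage factor, not a value from SHINE.
    """
    needed_px = 2.0 * n_hlr * hlr_arcsec / pixel_scale
    for size in sorted(tiers):
        if size >= needed_px:
            return size
    return max(tiers)  # very large sources fall back to the largest tier


# Made-up half-light radii in arcsec.
radii = [0.2, 0.5, 1.0, 2.0]
assigned = [assign_stamp_tier(r) for r in radii]

# Group source indices by tier; each group shares one array shape.
by_tier = defaultdict(list)
for index, size in enumerate(assigned):
    by_tier[size].append(index)

print(assigned)
print(dict(by_tier))
```

Grouping sources by tier gives every source in a group the same stamp shape, which is what allows a batched render per tier while keeping small galaxies on cheap 64 px stamps.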

## Quick Start

### Run inference from a config file

70 changes: 70 additions & 0 deletions configs/euclid_vis.yaml
@@ -0,0 +1,70 @@
# Euclid VIS multi-exposure shear inference — NGC 6505 test data
#
# Points to the bundled test data in data/EUC_VIS_SWL/.
# Selects the 600 brightest sources and runs MAP inference for quick testing.
#
# The `gal:` section specifies the probabilistic model: galaxy profile type,
# which parameters are latent variables (distributions) vs fixed, and what
# priors are used. Parameters with `center: catalog` use per-source catalog
# values as the distribution location at runtime.

data:
  exposure_paths:
    - data/EUC_VIS_SWL/EUC_VIS_SWL-DET-002704-00-1-0000000__20241017T042756.626122Z_3-4-F.fits.gz
    - data/EUC_VIS_SWL/EUC_VIS_SWL-DET-002704-01-1-0000000__20241017T042805.678587Z_3-4-F.fits.gz
    - data/EUC_VIS_SWL/EUC_VIS_SWL-DET-002704-02-1-0000000__20241017T042754.590559Z_3-4-F.fits.gz
  psf_path: data/EUC_VIS_SWL/PSF_3-4-F.fits.gz
  catalog_path: data/EUC_VIS_SWL/catalogue_3-4-F.fits.gz
  background_paths:
    - data/EUC_VIS_SWL/EUC_VIS_SWL-BKG-002704-00-1-0000000__20241017T042756.626186Z_3-4-F.fits.gz
    - data/EUC_VIS_SWL/EUC_VIS_SWL-BKG-002704-01-1-0000000__20241017T042805.678757Z_3-4-F.fits.gz
    - data/EUC_VIS_SWL/EUC_VIS_SWL-BKG-002704-02-1-0000000__20241017T042754.590644Z_3-4-F.fits.gz
  quadrant: "3-4.F"
  pixel_scale: 0.1

sources:
  min_snr: 10.0
  require_vis_detected: true
  exclude_spurious: true
  exclude_deblended: false
  exclude_point_sources: true
  det_quality_exclude_mask: 0x78C
  max_sources: 600

# Galaxy model specification — the probabilistic model is explicit here.
# Each parameter is either a fixed value or a distribution (= latent variable).
gal:
  type: Exponential

  shear:
    type: G1G2
    g1: {type: Normal, mean: 0.0, sigma: 0.05}
    g2: {type: Normal, mean: 0.0, sigma: 0.05}

  # Catalog-centered priors: center="catalog" means the LogNormal median
  # is set to each source's catalog value at runtime.
  flux: {type: LogNormal, center: catalog, sigma: 0.5}
  half_light_radius: {type: LogNormal, center: catalog, sigma: 0.3}

  ellipticity:
    type: E1E2
    e1: {type: Normal, mean: 0.0, sigma: 0.3}
    e2: {type: Normal, mean: 0.0, sigma: 0.3}

  # Position offsets from catalog positions (in arcsec)
  position:
    type: Offset
    dx: {type: Normal, mean: 0.0, sigma: 0.05}
    dy: {type: Normal, mean: 0.0, sigma: 0.05}

inference:
  method: map
  map_config:
    enabled: true
    num_steps: 200
    learning_rate: 0.002
  rng_seed: 42

galaxy_stamp_sizes: [64, 128, 256]
background: fixed
output_dir: results/euclid
Git LFS file not shown
Git LFS file not shown
Git LFS file not shown
Git LFS file not shown
Git LFS file not shown
Git LFS file not shown
3 changes: 3 additions & 0 deletions data/EUC_VIS_SWL/PSF_3-4-F.fits.gz
Git LFS file not shown