Merged
19 changes: 18 additions & 1 deletion .github/workflows/ci.yml
@@ -45,4 +45,21 @@ jobs:
         run: poetry install

       - name: Run tests
-        run: poetry run pytest tests/ -v --cov=pygaborstm --cov-report=term-missing
+        run: poetry run pytest tests/ -v -m "not gpu" --cov=pygaborstm --cov-report=term-missing
+
+  docs:
+    name: Docs build
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Install Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Install doc requirements
+        run: pip install -r docs/requirements.txt
+
+      - name: Build docs
+        run: mkdocs build --strict
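
The `-m "not gpu"` selection above only deselects tests that carry a `gpu` marker, and pytest warns about markers it does not know. The sketch below shows one way such a marker could be registered from a `conftest.py`. The file location and the marker description are assumptions; the repository may register the marker elsewhere, for example in `pyproject.toml`.

```python
# conftest.py (hypothetical location; the repository may register the marker elsewhere)

# Marker description is illustrative, not taken from the repository.
GPU_MARKER = "gpu: test requires a CUDA device and CuPy"


def pytest_configure(config):
    """Register the `gpu` marker so `-m "not gpu"` selects without warnings."""
    # addinivalue_line appends to the `markers` ini option at startup.
    config.addinivalue_line("markers", GPU_MARKER)
```

With the marker registered, a GPU-only test would be decorated with `@pytest.mark.gpu`, and the CI command skips it while a plain `pytest -v` still runs it.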
6 changes: 5 additions & 1 deletion .gitignore
@@ -45,4 +45,8 @@ ENV/
 logs/
 outputs/
 checkpoints/
-*.log
+*.log
+
+images/
+
+site/
21 changes: 21 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,21 @@
# Read the Docs build configuration.
# See https://docs.readthedocs.io/en/stable/config-file/v2.html
#
# Uses pip (not poetry) because RTD's poetry support is fragile and
# pyproject.toml here pins cupy-cuda13x, which can't install on RTD.
# docs/requirements.txt mirrors the pieces docs actually need.

version: 2

build:
  os: ubuntu-22.04
  tools:
    python: "3.12"

mkdocs:
  configuration: mkdocs.yml
  fail_on_warning: true

python:
  install:
    - requirements: docs/requirements.txt
23 changes: 15 additions & 8 deletions README.md
@@ -3,9 +3,8 @@ PyGaborSTM is a Python library for extracting Rate-Scale-Frequency (RSF) represe

 ## Installation

-<!-- TODO: Publish to PyPI -->
 ```bash
-pip install pygaborstm # not published yet
+pip install pygaborstm
 ```
 For now, install from source (see below).

@@ -63,9 +62,9 @@ spec = model.spectrogram(audio)
 rsf = model.rsf(spec)

 # Visualization
-stm.plot.spectrogram(spec)
-stm.plot.rsf(rsf)
-stm.plot.rsf(rsf, fold=True) # Symmetric folding
+stm.plot.plt_spectrogram(spec)
+stm.plot.plt_rsf(rsf)
+stm.plot.plt_rsf(rsf, fold=True) # Symmetric folding
 ```

 See `notebooks/example_usage.ipynb` for more examples.
@@ -83,7 +82,7 @@ config = stm.Config(
     octaves=5.3, # Frequency range in octaves

     # RSF / Gabor
-    resolution="low", # "low", "medium", "high", "ultra"
+    resolution="low", # "low", "medium", "high", "ultra", "max", "overkill"
 )
 ```

@@ -98,7 +97,9 @@ PyGaborSTM/
 │   ├── gabor.py # GaborFilterbank
 │   ├── core.py # PyGaborSTM class
 │   ├── plot.py # Plotting functions
-│   └── backend.py # NumPy/CuPy switching
+│   ├── analysis.py # MTF analysis helpers
+│   ├── backend.py # NumPy/CuPy switching
+│   └── gammatone_kernel.py # Custom CUDA SOS kernel
 ├── notebooks/
 └── tests/
 ```
@@ -107,11 +108,17 @@ PyGaborSTM/
 ```bash
 poetry install # Install all dependencies
 poetry run jupyter notebook # Run notebooks
-poetry run pytest -v # Run tests
+poetry run pytest -m "not gpu" # Run all tests excluding GPU kernel tests (used in CI/CD)
+poetry run pytest -v # Run all tests including GPU kernel tests
 poetry run ruff check --fix . # lint and fix
 poetry run ruff format . # format code
 ```

+### Serve Docs locally
+```bash
+poetry run mkdocs serve
+```
+
 Note: Please lint and format before pushing, as CI will fail otherwise.

 ### Jupyter Kernel
3 changes: 3 additions & 0 deletions docs/api/analysis.md
@@ -0,0 +1,3 @@
# Analysis

::: pygaborstm.analysis
3 changes: 3 additions & 0 deletions docs/api/backend.md
@@ -0,0 +1,3 @@
# Backend helpers

::: pygaborstm.backend
3 changes: 3 additions & 0 deletions docs/api/config.md
@@ -0,0 +1,3 @@
# Config

::: pygaborstm.config
3 changes: 3 additions & 0 deletions docs/api/core.md
@@ -0,0 +1,3 @@
# PyGaborSTM

::: pygaborstm.core
3 changes: 3 additions & 0 deletions docs/api/gabor.md
@@ -0,0 +1,3 @@
# GaborFilterbank

::: pygaborstm.gabor
3 changes: 3 additions & 0 deletions docs/api/gammatone_kernel.md
@@ -0,0 +1,3 @@
# CUDA kernel

::: pygaborstm.gammatone_kernel
24 changes: 24 additions & 0 deletions docs/api/index.md
@@ -0,0 +1,24 @@
# API Reference

Auto-generated from the source docstrings.

## Top-level

- [PyGaborSTM](core.md) — main user-facing class.
- [Config](config.md) — configuration dataclass.
- [Spectrogram, RSF](structs.md) — output data structures.

## Pipeline stages

- [AuditorySpectrogram](spectrogram.md) — cochlear-model spectrogram (stage 1).
- [GaborFilterbank](gabor.md) — 2D Gabor filterbank + RSF extraction (stage 2).

## Visualization & analysis

- [Plotting](plot.md) — `matplotlib` helpers for spectrograms and RSFs.
- [Analysis](analysis.md) — matched-filter MTF computation.

## Internals

- [Backend helpers](backend.md) — NumPy/CuPy switching, memory probe, dtype pairs.
- [CUDA kernel](gammatone_kernel.md) — custom batched-SOS kernel for the y1 stage.
3 changes: 3 additions & 0 deletions docs/api/plot.md
@@ -0,0 +1,3 @@
# Plotting

::: pygaborstm.plot
3 changes: 3 additions & 0 deletions docs/api/spectrogram.md
@@ -0,0 +1,3 @@
# AuditorySpectrogram

::: pygaborstm.spectrogram
3 changes: 3 additions & 0 deletions docs/api/structs.md
@@ -0,0 +1,3 @@
# Data structures

::: pygaborstm.structs
74 changes: 74 additions & 0 deletions docs/getting-started.md
@@ -0,0 +1,74 @@
# Getting Started

## Install

PyGaborSTM is not yet on PyPI. Install from source:

```bash
git clone https://github.com/JHU-LCAP/PyGaborSTM.git
cd PyGaborSTM
poetry install
```

### GPU support (optional)

GPU acceleration uses CuPy. On Linux/Windows with a CUDA-capable NVIDIA
GPU and the matching CUDA toolkit installed, `poetry install` will pull
the correct `cupy-cudaXXx` wheel. The library automatically falls back
to NumPy with a `UserWarning` if CuPy is unavailable.

```bash
# Check your CUDA version
nvidia-smi
```

| CUDA version | CuPy wheel |
|--------------|------------------|
| 11.x | `cupy-cuda11x` |
| 12.x | `cupy-cuda12x` |
| 13.x | `cupy-cuda13x` |
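
The fallback described above can be pictured with a short sketch: try to import the GPU array module, and if the import fails, emit a `UserWarning` and return the CPU module instead. `select_backend` and its parameters are hypothetical names used for illustration; the library's own logic lives in `pygaborstm.backend` and may differ, for instance by also probing for a usable device.

```python
import importlib
import warnings


def select_backend(use_gpu, gpu_module="cupy", cpu_module="numpy"):
    """Return (module, on_gpu). Hypothetical helper, not PyGaborSTM's API."""
    if use_gpu:
        try:
            # Importing succeeds only if the GPU package is installed.
            return importlib.import_module(gpu_module), True
        except ImportError:
            warnings.warn(
                f"{gpu_module} is unavailable; falling back to {cpu_module}.",
                UserWarning,
                stacklevel=2,
            )
    # CPU path: requested explicitly, or reached after a failed GPU import.
    return importlib.import_module(cpu_module), False
```

Calling `select_backend(True)` on a machine without CuPy would return the NumPy module and `False`, after one warning.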

## Quick start

```python
import pygaborstm as stm

# CPU
model = stm.PyGaborSTM()

# GPU (or fall back to CPU with a warning)
model = stm.PyGaborSTM(config=stm.Config(use_gpu=True))

# Two-stage usage
spec = model.spectrogram(audio) # Spectrogram dataclass
rsf = model.rsf(spec) # RSF dataclass

# Or chain both stages on device with no intermediate host copy
rsf = model.compute(audio)

# Visualize
stm.plot.plt_spectrogram(spec)
stm.plot.plt_rsf(rsf)
stm.plot.plt_rsf(rsf, fold=True) # symmetric folding
```

See [`notebooks/example_usage.ipynb`](https://github.com/JHU-LCAP/PyGaborSTM/blob/main/notebooks/example_usage.ipynb)
in the repo for a worked example.

## Configuration

All pipeline parameters live in a single [`Config`](api/config.md) dataclass:

```python
config = stm.Config(
    use_gpu=False,      # GPU acceleration
    sample_rate=16000,  # audio sample rate
    n_filters=128,      # cochlear channels
    f_min=180.0,        # lowest filter center frequency (Hz)
    octaves=5.3,        # frequency range in octaves
    resolution="low",   # "low", "medium", "high", "ultra", "max", "overkill"
)
```

See the [Config reference](api/config.md) for the full list of fields and
their defaults.
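
As a quick sanity check on how `f_min`, `octaves`, and `sample_rate` relate, the sketch below computes the frequency that lies `octaves` octaves above `f_min`. It assumes the range is measured upward from `f_min`, which the text above does not state explicitly, and `upper_edge_hz` is a hypothetical helper rather than part of the library. With the values shown in the example, the result is roughly 7.1 kHz, below the 8 kHz Nyquist limit of a 16 kHz signal.

```python
def upper_edge_hz(f_min: float, octaves: float) -> float:
    """Frequency `octaves` octaves above `f_min`; each octave doubles the frequency."""
    return f_min * 2.0 ** octaves


# Values copied from the Config example; the interpretation is an assumption.
f_max = upper_edge_hz(180.0, 5.3)
nyquist = 16000 / 2

print(f"top of range: {f_max:.0f} Hz, Nyquist: {nyquist:.0f} Hz")
```

If a chosen `f_min` and `octaves` pushed the top of the range above Nyquist, the highest channels could not be represented at that sample rate.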
33 changes: 33 additions & 0 deletions docs/index.md
@@ -0,0 +1,33 @@
# PyGaborSTM

Python library for extracting **Rate-Scale-Frequency (RSF)** representations
from audio signals using bio-inspired auditory spectrograms and 2D Gabor
filterbanks, following Chi, Ru & Shamma (2005) and Bellur & Elhilali (2017).

## What it does

```text
audio ──▶ AuditorySpectrogram ──▶ GaborFilterbank ──▶ RSF
          (n_freq × n_time)                            (n_frames × n_rates × n_scales × n_freq)
```

- **CPU or GPU**: drop-in NumPy/CuPy backend.
- **Memory-adaptive Gabor stage**: caches kernel FFTs when memory allows,
falls back to streaming otherwise.
- **Custom batched-SOS CUDA kernel** for the cochlear filter stage.

## Where to go next

- [Getting Started](getting-started.md) — install and run the pipeline on
one audio file.
- [API Reference](api/index.md) — generated from the source docstrings.
- [GitHub repository](https://github.com/JHU-LCAP/PyGaborSTM) — issues, PRs.

## References

- Bellur, A., & Elhilali, M. (2017). Feedback-driven sensory mapping
adaptation for robust speech activity detection. *IEEE/ACM Transactions
on Audio, Speech, and Language Processing*, 25(3), 481–492.
- Chi, T., Ru, P., & Shamma, S. A. (2005). Multiresolution spectrotemporal
analysis of complex sounds. *The Journal of the Acoustical Society of
America*, 118(2), 887–906.
12 changes: 12 additions & 0 deletions docs/requirements.txt
@@ -0,0 +1,12 @@
# Documentation build requirements (used by Read the Docs).
# Kept separate from pyproject.toml because RTD installs via pip, not poetry.
# Mirrors the `docs` dependency group plus the runtime deps that
# mkdocstrings needs to actually import the package.

mkdocs>=1.6,<2.0
mkdocs-material>=9.5,<10.0
mkdocstrings[python]>=1.0,<2.0

numpy>=2.3,<3.0
scipy>=1.16,<2.0
matplotlib>=3.10,<4.0
74 changes: 74 additions & 0 deletions mkdocs.yml
@@ -0,0 +1,74 @@
site_name: PyGaborSTM
site_description: Rate-Scale-Frequency (RSF) representations from audio via auditory spectrograms and 2D Gabor filterbanks.
site_url: https://pygaborstm.readthedocs.io/
repo_url: https://github.com/JHU-LCAP/PyGaborSTM
repo_name: JHU-LCAP/PyGaborSTM

theme:
  name: material
  features:
    - navigation.sections
    - navigation.indexes
    - navigation.top
    - content.code.copy
  palette:
    - media: "(prefers-color-scheme: light)"
      scheme: default
      primary: indigo
      accent: indigo
      toggle:
        icon: material/weather-night
        name: Switch to dark mode
    - media: "(prefers-color-scheme: dark)"
      scheme: slate
      primary: indigo
      accent: indigo
      toggle:
        icon: material/weather-sunny
        name: Switch to light mode

plugins:
  - search
  - mkdocstrings:
      default_handler: python
      handlers:
        python:
          options:
            docstring_style: numpy
            show_source: false
            show_root_heading: true
            show_root_full_path: false
            show_symbol_type_heading: true
            show_symbol_type_toc: true
            members_order: source
            separate_signature: true
            show_signature_annotations: true
            signature_crossrefs: true
            merge_init_into_class: true
            docstring_section_style: spacy

markdown_extensions:
  - admonition
  - pymdownx.details
  - pymdownx.superfences
  - pymdownx.highlight:
      anchor_linenums: true
  - pymdownx.inlinehilite
  - pymdownx.snippets
  - toc:
      permalink: true

nav:
  - Home: index.md
  - Getting Started: getting-started.md
  - API Reference:
      - api/index.md
      - PyGaborSTM: api/core.md
      - Config: api/config.md
      - Spectrogram (stage): api/spectrogram.md
      - GaborFilterbank (stage): api/gabor.md
      - Data structures: api/structs.md
      - Plotting: api/plot.md
      - Analysis: api/analysis.md
      - Backend helpers: api/backend.md
      - CUDA kernel: api/gammatone_kernel.md
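
Because the configuration above sets `docstring_style: numpy`, the `::: pygaborstm.*` directives expect NumPy-style sections in the source docstrings. The function below is a hypothetical example of that layout and is not taken from the package; only the section structure is the point.

```python
import math


def octaves_between(f_low: float, f_high: float) -> float:
    """Return the number of octaves separating two frequencies.

    Parameters
    ----------
    f_low : float
        Lower frequency in Hz. Must be positive.
    f_high : float
        Upper frequency in Hz. Must be positive.

    Returns
    -------
    float
        Base-2 logarithm of ``f_high / f_low``.

    Raises
    ------
    ValueError
        If either frequency is not positive.
    """
    if f_low <= 0 or f_high <= 0:
        raise ValueError("frequencies must be positive")
    return math.log2(f_high / f_low)
```

With `merge_init_into_class: true`, parameters documented on a class `__init__` would be merged into the class entry, so the same section layout applies there.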