Skip to content

Split pipeline.images_to_pics and cli.main into smaller helpers (#12)#24

Merged
rfrenchseti merged 6 commits into
mainfrom
rf_260512
May 13, 2026
Merged

Split pipeline.images_to_pics and cli.main into smaller helpers (#12)#24
rfrenchseti merged 6 commits into
mainfrom
rf_260512

Conversation

@rfrenchseti
Copy link
Copy Markdown
Collaborator

@rfrenchseti rfrenchseti commented May 13, 2026

Summary

Fixes #12. Splits the two largest functions in the repo into focused helpers so each phase of the pipeline has a clear name, a clear contract, and direct unit-test coverage.

  • src/picmaker/pipeline.py:
    • _pds3_resolve_pointer — the PDS3 detached-label ^pointer resolution that produces (imagefile, filter_info).
    • _hst_mosaic_rgb — the per-detector slice → colormap → mosaic flow for HST ACS/WFC and WFPC2, with _hst_wfpc2_mosaic and _hst_acs_panel_mosaic as geometry sub-helpers.
    • _process_one_image — the per-file get_outfile → read → slice → colormap → orient → resize → write chain.
    • images_to_pics body collapses to a flat loop that builds a PicmakerOptions, backfills the legacy None-means-default kwargs, and delegates each filename to _process_one_image.
  • src/picmaker/cli.py:
    • _collect_option_dicts — the --versions FILE re-parse loop that produces one normalized option_dict per non-blank line (main CLI's --replace / --proceed always win).
    • _process_directory — the per-directory walk, recursive or not.
    • main collapses to the thin top-level orchestrator that handles KeyboardInterrupt / SystemExit / Exception.

Acceptance criteria from #12:

  • images_to_pics body is < 100 lines (now ~84).
  • cli.main body is < 100 lines (now ~67).
  • New private helpers have docstrings and direct unit tests (HST mosaic geometry, PDS3 pointer resolution, versions-loop parsing, per-directory walk).
  • Ruff per-file ignores — there were none active on pipeline.py / cli.py (the RUF005 / A002 / N806 ignores the issue mentioned had already been removed); no new per-file ignores added.
  • Existing test suite passes unchanged (504 tests passing, 92.94% coverage).

Documentation updates:

  • docs/dev/pipeline.rst — flowchart and prose updated to name the new helpers.
  • docs/dev/module_layout.rst — module summary updated.
  • docs/module.rst — new private helpers added to the Sphinx :private-members: list so they render in the API reference.

Test plan

  • pytest — 504 passed, coverage 92.94%
  • ruff check src/ tests/ — clean
  • mypy src/picmaker/ — clean (one new assert options.extension is not None to thread str | None through get_outfile)
  • bash scripts/run-all-checks.sh -c — all code checks pass
  • bash scripts/run-all-checks.sh -d — Sphinx build clean, PyMarkdown clean
  • pyroma — 10/10
  • Snapshot tests pass unchanged (no fixture regeneration needed)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced VICAR file format parsing to accept files with minor structural variations, improving compatibility with diverse data sources
  • Tests

    • Added extensive unit tests for CLI argument processing and image pipeline operations to improve system reliability and robustness

Review Change Stack

rfrenchseti and others added 2 commits May 13, 2026 10:20
Extract the per-file body of `images_to_pics` into `_process_one_image`,
the inline PDS3 detached-label pointer-resolution block into
`_pds3_resolve_pointer`, and the HST ACS/WFC + WFPC2 mosaic-assembly
block into `_hst_mosaic_rgb` (with `_hst_wfpc2_mosaic` and
`_hst_acs_panel_mosaic` as geometry sub-helpers). Extract the
`--versions` re-parse loop and the per-directory walk from `cli.main`
into `_collect_option_dicts` and `_process_directory`.

Each new helper gets direct unit-test coverage in
`tests/test_pipeline_helpers.py` and `tests/test_cli_helpers.py`, and
the developer-guide pipeline / module-layout docs and the module API
reference are updated to point at the new helpers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 13, 2026

Warning

Rate limit exceeded

@rfrenchseti has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 55 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c52fd510-d209-4f42-86fd-350ae5e6d76d

📥 Commits

Reviewing files that changed from the base of the PR and between 67af22b and ebe924d.

📒 Files selected for processing (6)
  • src/picmaker/cli.py
  • src/picmaker/pipeline.py
  • tests/conftest.py
  • tests/test_cli_helpers.py
  • tests/test_pipeline_branches.py
  • tests/test_pipeline_helpers.py

Walkthrough

This PR refactors two large functions in the codebase—pipeline.images_to_pics and cli.main—by extracting smaller, testable helper functions. New pipeline helpers handle PDS3 pointer resolution and HST mosaic assembly; new CLI helpers handle versions-file parsing and directory processing. The refactor reduces module complexity while maintaining backward compatibility and adding comprehensive unit test coverage.

Changes

Pipeline and CLI refactoring with helper extraction

Layer / File(s) Summary
HST and PDS3 pipeline helper functions
src/picmaker/pipeline.py, tests/test_pipeline_helpers.py
Four new pipeline helpers: _pds3_resolve_pointer for PDS3 label pointer resolution, _hst_wfpc2_mosaic and _hst_acs_panel_mosaic for quad/panel assembly with filename-order inference, and _hst_mosaic_rgb orchestrating HST mosaic construction with per-band stretch/colormap and optional flipping. Module docstring updated. Comprehensive test coverage for all cases including error handling.
Per-image processing orchestration
src/picmaker/pipeline.py, tests/test_pipeline_helpers.py
_process_one_image encapsulates per-file workflow: output path computation, conditional skip on replace='none', PDS3 pointer delegation, display-orientation/colormap decisions, HST-vs-single-band dispatch, and post-processing (rotate/gamma/size/wrap/pad). Returns stretch limits and reuse tuple for aggregation. Tests cover output creation, reuse short-circuiting, skip behavior, and display-orientation overrides.
Pipeline main loop refactor (images_to_pics)
src/picmaker/pipeline.py
Refactored to construct PicmakerOptions once with legacy None defaults backfilled, clamp reuse to single-file batches, loop over filenames calling _process_one_image, aggregate stretch limits, and return median endpoints or (None, None, None) if no limits accumulated, plus last reuse tuple. Exception handling respects proceed flag.
CLI helper functions for options and directory processing
src/picmaker/cli.py, tests/test_cli_helpers.py
_collect_option_dicts produces normalized option dicts from CLI args or by re-parsing --versions lines (enforcing top-level --replace/--proceed across all dicts), and _process_directory centralizes recursive/non-recursive directory walking, pattern filtering, output root computation, and delegation to process_images. Refactored main() calls these helpers in place of 100+ lines of inline logic. Tests validate versions-file parsing, blank-line skipping, directory traversal modes, pattern filtering, and verbose logging.
Documentation updates for new structure
docs/dev/module_layout.rst, docs/dev/pipeline.rst, docs/module.rst
Updated module layout documentation to enumerate CLI-phase helpers and pipeline per-image decomposition. Updated pipeline documentation Mermaid flowchart and text to show new helpers in context. Updated Sphinx :private-members: directives to include new private helper symbols.
Minor reader and VICAR parsing change
src/picmaker/io.py
Updated VICAR image reader to pass strict=False to VicarImage.from_file(), adjusting validation strictness during the reader-cascade.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Split pipeline.images_to_pics and cli.main into smaller helpers (#12)' clearly and concisely summarizes the main refactoring work across two key files.
Description check ✅ Passed The description covers the purpose (issue #12), lists detailed implementation changes in both cli.py and pipeline.py, includes testing/validation steps, and addresses all critical template sections comprehensively.
Linked Issues check ✅ Passed All acceptance criteria from issue #12 are met: images_to_pics body reduced to ~84 lines, cli.main to ~67 lines, new helpers have docstrings and direct unit tests, existing test suite passes with 504 tests at 92.94% coverage, and per-file ruff ignores were narrowed.
Out of Scope Changes check ✅ Passed All changes are in scope for issue #12: extracting helpers (_pds3_resolve_pointer, _hst_mosaic_rgb, _hst_wfpc2_mosaic, _hst_acs_panel_mosaic, _process_one_image, _collect_option_dicts, _process_directory), updating documentation, and adding comprehensive test coverage. A minor incidental change to io.py (VICAR strict=False) is well-justified.
Docstring Coverage ✅ Passed Docstring coverage is 98.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 13, 2026

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

Thanks for integrating Codecov - We've got you covered ☂️

`assert options.extension is not None` was flagged by bandit (B101)
because `python -O` strips asserts. The cast preserves the type
narrowing for mypy without introducing a runtime check that vanishes
under optimisation — matching the codebase's existing policy
(see the `# Use ValueError (not 'assert') so the check survives
'python -O'` comment in `process_images`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/picmaker/cli.py`:
- Around line 504-509: The function _collect_option_dicts currently has four
positional parameters; update its signature so only the first three are
positional and make replace and proceed keyword-only (e.g., def
_collect_option_dicts(parser: argparse.ArgumentParser, options:
argparse.Namespace, *, replace: str, proceed: bool) -> list[dict[str, Any]]),
and update any callers to pass replace and proceed by keyword; apply the same
change to the other similar function occurrence referenced (the one around the
other occurrence) so both comply with the three-positional-parameter rule.
- Around line 599-603: When computing out_dir using out_dir, directory, this_dir
and lcommon, guard against lcommon == -1 and absolute source paths by resolving
to absolute paths and computing a relative path instead of slicing: if directory
is not None, call os.path.abspath on directory and this_dir, compute rel =
os.path.relpath(this_dir_abs, start=directory_abs) (or fallback to basename if
rel starts with ".."), then join with directory using os.path.join; after join
normalize with os.path.normpath and verify containment with
os.path.commonpath(directory_abs, out_dir_abs) == directory_abs to prevent
output-root escape; apply the same pattern to the other occurrence that uses
lcommon (the block analogous to lines 617-621).
- Around line 612-617: The current non-recursive file collection uses
os.listdir/fnmatch and can include directory names (variables: files_in_dir,
f_filtered, filepaths, dirpath, pattern) which are then passed to process_images
as files; update the logic to filter f_filtered down to actual files only before
building filepaths by checking os.path.isfile(os.path.join(dirpath, name)) (or
equivalent) so only real files are included in filepaths and directories
matching pattern are excluded.
- Around line 543-547: The code uses line.split() to tokenise --versions file
lines which breaks quoted arguments; update the tokenisation to use
shlex.split(line) so quoted suffixes/spaces are preserved and CLI semantics
match parser.parse_args usage; add an import shlex (if missing) and replace the
use of line.split() where new_args is assigned (refer to variable new_args and
the parser.parse_args(sys.argv[1:] + new_args) call) while preserving the
existing empty-check and continue behavior.

In `@src/picmaker/io.py`:
- Line 167: Add a brief inline comment directly above the VicarImage.from_file
call explaining why strict=False is required: note that some VICAR files
produced by X (or legacy/variably-formatted VICAR variants) contain
extra/nonstandard metadata or slight header deviations that would be rejected by
strict parsing, so we use strict=False (with extraneous='print') to accept those
files and log the extraneous fields; reference the call
VicarImage.from_file(filename_str, extraneous='print', strict=False) in the
comment.

In `@src/picmaker/pipeline.py`:
- Around line 312-343: The per-band processing chain (slice_array → optional
fill_zebra_stripes → get_limits → apply_colormap) is duplicated between
_hst_mosaic_rgb and _process_one_image; extract that sequence into a shared
helper (e.g., process_band_slice or build_rgb_slice) that takes the input
arguments (array3d or array2d slice indices, options, colormap, is_int, bands
where applicable) and returns (array_rgb, these_limits) so both call sites call
the helper and then do their existing append/limits bookkeeping; update callers
in _hst_mosaic_rgb (the loop over b using array3d) and _process_one_image
(single-band branch) to use the new helper and remove the duplicated code
(references: slice_array, fill_zebra_stripes, get_limits, apply_colormap,
arrays_rgb, bands, options).
- Line 749: The conditional currently uses ambiguous truthiness for the sequence
"min_limits" (if not min_limits); change it to an explicit empty-length check by
replacing that test with "if len(min_limits) == 0:" so the code follows the
guideline to use explicit length checks for empty sequences—locate the
conditional that references the variable min_limits in src/picmaker/pipeline.py
and update it accordingly.
- Around line 160-174: The custom IndexError is unreachable because pds_obj is
indexed before validating bounds; reorder logic in src/picmaker/pipeline.py so
you first compute max_obj from obj (for None set max_obj = len(pds_obj)-1; for
int set max_obj = obj; for sequence set max_obj = max(obj)), then check if
max_obj >= len(pds_obj) and raise the informative IndexError using pname if out
of range, and only after that construct imagefile (using pds_obj[obj] or list
comprehensions) to avoid Python's native IndexError; also tighten the
corresponding test to assert the pointer name (e.g. match=r'index \d+ for PDS
pointer IMAGE') so the custom message is verified.

In `@tests/test_cli_helpers.py`:
- Around line 133-135: The test currently checks existence by doing found =
list(out_dir.rglob('cassini_iss.jpg')) and assert found, which is too loose;
change it to assert the exact expected paths instead (e.g. compute found =
sorted(out_dir.rglob('cassini_iss.jpg')) or use list and compare to an explicit
list like [out_dir / 'cassini_iss.jpg']) so the assertion verifies exact output
paths and not merely presence; apply the same exact-path equality change to the
similar check around lines 188-189.
- Line 209: Replace the combined assertion that uses "or" with a
single-condition assertion guarded by an if: check out_dir.exists() first and,
if it does, assert that not any(out_dir.iterdir()); otherwise allow the
non-existence case (no assertion needed) — i.e., use an if out_dir.exists():
assert not any(out_dir.iterdir()) pattern to eliminate the combined assert
involving out_dir.exists() and out_dir.iterdir().

In `@tests/test_pipeline_helpers.py`:
- Line 318: Add a brief justification comment to each existing "# type:
ignore[arg-type]" so readers know why mypy is being bypassed; specifically
update the ignore on the return PicmakerOptions(**base) (and the similar ignore
at the other helper) to read "# type: ignore[arg-type]  # <short reason>" where
the reason explains that a dict[str, object] is being unpacked into a dataclass
with narrower typed fields (e.g., "dict[str, object] unpacks into typed
dataclass fields"), keeping the justification short and to the point.
- Around line 321-340: The test docstring claims both colored and grayscale
branches but the test only exercises the colored branch; either update the
docstring to describe only the colored-case behavior or extend the test to cover
both branches by parametrizing test_hst_mosaic_rgb_wfpc2_produces_2x2_mosaic
over colormap values and invoking _hst_mosaic_rgb with colormap=None (asserting
mosaic.shape == (8,8,1) and returned.shape unchanged) as well as with
colormap='red-blue' (asserting (8,8,3)); modify assertions and the call sites to
match each branch accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: facd64d5-ebf6-4c17-9ee9-4f10b6f25f25

📥 Commits

Reviewing files that changed from the base of the PR and between fe56198 and 67af22b.

📒 Files selected for processing (8)
  • docs/dev/module_layout.rst
  • docs/dev/pipeline.rst
  • docs/module.rst
  • src/picmaker/cli.py
  • src/picmaker/io.py
  • src/picmaker/pipeline.py
  • tests/test_cli_helpers.py
  • tests/test_pipeline_helpers.py

Comment thread src/picmaker/cli.py
Comment thread src/picmaker/cli.py Outdated
Comment thread src/picmaker/cli.py
Comment thread src/picmaker/cli.py
Comment thread src/picmaker/io.py
Comment thread src/picmaker/pipeline.py Outdated
Comment thread tests/test_cli_helpers.py Outdated
Comment thread tests/test_cli_helpers.py Outdated
Comment thread tests/test_pipeline_helpers.py Outdated
Comment thread tests/test_pipeline_helpers.py Outdated
rfrenchseti and others added 3 commits May 13, 2026 11:03
- Make `_collect_option_dicts`' `replace` / `proceed` keyword-only so
  the helper respects the ≤3-positional-parameters rule.
- Promote the `_basic_option_dict()` factory duplicated in
  `test_pipeline_branches.py` and `test_cli_helpers.py` to a
  `basic_option_dict` fixture in `conftest.py`; both files now consume
  it via the fixture parameter.
- Tighten cross-references in the `_process_one_image` docstring
  (``images_to_pics`` → `:func:`~picmaker.pipeline.images_to_pics``)
  and drop the stale `picmaker.py:543` / `picmaker.py:548` line-number
  references in `_collect_option_dicts`.
- Replace `if not <list>:` with explicit `if len(<list>) == 0:` in
  `pipeline.images_to_pics`, `cli._process_directory`, and
  `cli._collect_option_dicts` per the falsy-check rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Collapse the three `test_pds3_resolve_pointer_obj_*` tests into one
  `@pytest.mark.parametrize`-driven test over (obj, expected_tails);
  keep the no-DETECTOR_ID inst_id check as its own focused test.
- Collapse the three WFPC2 mosaic tests into one parametrized test
  over (imagefile, expected_quadrant_fills) using a shared quadrant
  slice map.
- Collapse the two ACS panel mosaic tests into one parametrized test
  over (imagefile, top_fill, bottom_fill).
- In `_process_directory` docstring, qualify `:func:\`process_images\``
  → `:func:\`~picmaker.pipeline.process_images\`` so the ref survives
  Sphinx's nitpicky mode and stays consistent with the rest of the
  module's docstring style.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All eight findings were verified valid against current code; each fix
is minimal and preserves existing behaviour.

1. cli._collect_option_dicts: tokenise versions-file lines with
   shlex.split (was str.split); quoted values with embedded whitespace
   now round-trip through argparse the same way they do on the shell.
   Regression test added.

2. cli._process_directory non-recursive branch: filter os.listdir
   output to actual files via os.path.isfile so a subdirectory whose
   name matches `pattern` no longer flows into process_images as if it
   were a file. (The recursive branch already gets this for free from
   os.walk.) Regression test added.

3. pipeline._pds3_resolve_pointer: reorder the obj-bounds check so the
   informative custom IndexError ('index N for PDS pointer X out of
   range') fires before Python's bare 'list index out of range'.
   Existing test's `match='out of range'` was masking the dead code;
   tightened to match the pointer name.

4. pipeline._band_to_rgb: extract the slice → optional zebra fill →
   get_limits → apply_colormap chain into a shared helper. Both
   _hst_mosaic_rgb (per-detector loop) and _process_one_image
   (single-band branch) call it; 30+ duplicate lines eliminated.

5. test_cli_helpers: replace existence-only `rglob(...)` checks with
   exact-path equality on `sorted(rglob(...))` so the assertions
   verify the produced filenames, not just presence.

6. test_cli_helpers test_process_directory_no_match_is_noop: replace
   the combined `assert not X or not Y` with an explicit `if X:` guard
   so each assertion tests one condition.

7. test_pipeline_helpers: add brief justifications to the existing
   `# type: ignore[arg-type]` lines per the project's rule that
   line-level ignores must explain themselves.

8. test_pipeline_helpers test_hst_mosaic_rgb_wfpc2_produces_2x2_mosaic:
   the docstring described both colored and grayscale branches but
   the test only exercised the colored one. Parametrize over
   ('red-blue', 3) and (None, 1) so both branches are now covered.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rfrenchseti rfrenchseti merged commit a14a611 into main May 13, 2026
13 checks passed
@rfrenchseti rfrenchseti deleted the rf_260512 branch May 13, 2026 18:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Split pipeline.images_to_pics and cli.main into smaller helpers

1 participant