Split pipeline.images_to_pics and cli.main into smaller helpers (#12)#24
Conversation
Extract the per-file body of `images_to_pics` into `_process_one_image`, the inline PDS3 detached-label pointer-resolution block into `_pds3_resolve_pointer`, and the HST ACS/WFC + WFPC2 mosaic-assembly block into `_hst_mosaic_rgb` (with `_hst_wfpc2_mosaic` and `_hst_acs_panel_mosaic` as geometry sub-helpers). Extract the `--versions` re-parse loop and the per-directory walk from `cli.main` into `_collect_option_dicts` and `_process_directory`. Each new helper gets direct unit-test coverage in `tests/test_pipeline_helpers.py` and `tests/test_cli_helpers.py`, and the developer-guide pipeline / module-layout docs and the module API reference are updated to point at the new helpers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (6)
WalkthroughThis PR refactors two large functions in the codebase— ChangesPipeline and CLI refactoring with helper extraction
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Welcome to Codecov 🎉Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests. Thanks for integrating Codecov - We've got you covered ☂️ |
`assert options.extension is not None` was flagged by bandit (B101) because `python -O` strips asserts. The cast preserves the type narrowing for mypy without introducing a runtime check that vanishes under optimisation — matching the codebase's existing policy (see the `# Use ValueError (not 'assert') so the check survives 'python -O'` comment in `process_images`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 12
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/picmaker/cli.py`:
- Around line 504-509: The function _collect_option_dicts currently has four
positional parameters; update its signature so only the first three are
positional and make replace and proceed keyword-only (e.g., def
_collect_option_dicts(parser: argparse.ArgumentParser, options:
argparse.Namespace, *, replace: str, proceed: bool) -> list[dict[str, Any]]),
and update any callers to pass replace and proceed by keyword; apply the same
change to the other similar function occurrence referenced (the one around the
other occurrence) so both comply with the three-positional-parameter rule.
- Around line 599-603: When computing out_dir using out_dir, directory, this_dir
and lcommon, guard against lcommon == -1 and absolute source paths by resolving
to absolute paths and computing a relative path instead of slicing: if directory
is not None, call os.path.abspath on directory and this_dir, compute rel =
os.path.relpath(this_dir_abs, start=directory_abs) (or fallback to basename if
rel starts with ".."), then join with directory using os.path.join; after join
normalize with os.path.normpath and verify containment with
os.path.commonpath(directory_abs, out_dir_abs) == directory_abs to prevent
output-root escape; apply the same pattern to the other occurrence that uses
lcommon (the block analogous to lines 617-621).
- Around line 612-617: The current non-recursive file collection uses
os.listdir/fnmatch and can include directory names (variables: files_in_dir,
f_filtered, filepaths, dirpath, pattern) which are then passed to process_images
as files; update the logic to filter f_filtered down to actual files only before
building filepaths by checking os.path.isfile(os.path.join(dirpath, name)) (or
equivalent) so only real files are included in filepaths and directories
matching pattern are excluded.
- Around line 543-547: The code uses line.split() to tokenise --versions file
lines which breaks quoted arguments; update the tokenisation to use
shlex.split(line) so quoted suffixes/spaces are preserved and CLI semantics
match parser.parse_args usage; add an import shlex (if missing) and replace the
use of line.split() where new_args is assigned (refer to variable new_args and
the parser.parse_args(sys.argv[1:] + new_args) call) while preserving the
existing empty-check and continue behavior.
In `@src/picmaker/io.py`:
- Line 167: Add a brief inline comment directly above the VicarImage.from_file
call explaining why strict=False is required: note that some VICAR files
produced by X (or legacy/variably-formatted VICAR variants) contain
extra/nonstandard metadata or slight header deviations that would be rejected by
strict parsing, so we use strict=False (with extraneous='print') to accept those
files and log the extraneous fields; reference the call
VicarImage.from_file(filename_str, extraneous='print', strict=False) in the
comment.
In `@src/picmaker/pipeline.py`:
- Around line 312-343: The per-band processing chain (slice_array → optional
fill_zebra_stripes → get_limits → apply_colormap) is duplicated between
_hst_mosaic_rgb and _process_one_image; extract that sequence into a shared
helper (e.g., process_band_slice or build_rgb_slice) that takes the input
arguments (array3d or array2d slice indices, options, colormap, is_int, bands
where applicable) and returns (array_rgb, these_limits) so both call sites call
the helper and then do their existing append/limits bookkeeping; update callers
in _hst_mosaic_rgb (the loop over b using array3d) and _process_one_image
(single-band branch) to use the new helper and remove the duplicated code
(references: slice_array, fill_zebra_stripes, get_limits, apply_colormap,
arrays_rgb, bands, options).
- Line 749: The conditional currently uses ambiguous truthiness for the sequence
"min_limits" (if not min_limits); change it to an explicit empty-length check by
replacing that test with "if len(min_limits) == 0:" so the code follows the
guideline to use explicit length checks for empty sequences—locate the
conditional that references the variable min_limits in src/picmaker/pipeline.py
and update it accordingly.
- Around line 160-174: The custom IndexError is unreachable because pds_obj is
indexed before validating bounds; reorder logic in src/picmaker/pipeline.py so
you first compute max_obj from obj (for None set max_obj = len(pds_obj)-1; for
int set max_obj = obj; for sequence set max_obj = max(obj)), then check if
max_obj >= len(pds_obj) and raise the informative IndexError using pname if out
of range, and only after that construct imagefile (using pds_obj[obj] or list
comprehensions) to avoid Python's native IndexError; also tighten the
corresponding test to assert the pointer name (e.g. match=r'index \d+ for PDS
pointer IMAGE') so the custom message is verified.
In `@tests/test_cli_helpers.py`:
- Around line 133-135: The test currently checks existence by doing found =
list(out_dir.rglob('cassini_iss.jpg')) and assert found, which is too loose;
change it to assert the exact expected paths instead (e.g. compute found =
sorted(out_dir.rglob('cassini_iss.jpg')) or use list and compare to an explicit
list like [out_dir / 'cassini_iss.jpg']) so the assertion verifies exact output
paths and not merely presence; apply the same exact-path equality change to the
similar check around lines 188-189.
- Line 209: Replace the combined assertion that uses "or" with a
single-condition assertion guarded by an if: check out_dir.exists() first and,
if it does, assert that not any(out_dir.iterdir()); otherwise allow the
non-existence case (no assertion needed) — i.e., use an if out_dir.exists():
assert not any(out_dir.iterdir()) pattern to eliminate the combined assert
involving out_dir.exists() and out_dir.iterdir().
In `@tests/test_pipeline_helpers.py`:
- Line 318: Add a brief justification comment to each existing "# type:
ignore[arg-type]" so readers know why mypy is being bypassed; specifically
update the ignore on the return PicmakerOptions(**base) (and the similar ignore
at the other helper) to read "# type: ignore[arg-type] # <short reason>" where
the reason explains that a dict[str, object] is being unpacked into a dataclass
with narrower typed fields (e.g., "dict[str, object] unpacks into typed
dataclass fields"), keeping the justification short and to the point.
- Around line 321-340: The test docstring claims both colored and grayscale
branches but the test only exercises the colored branch; either update the
docstring to describe only the colored-case behavior or extend the test to cover
both branches by parametrizing test_hst_mosaic_rgb_wfpc2_produces_2x2_mosaic
over colormap values and invoking _hst_mosaic_rgb with colormap=None (asserting
mosaic.shape == (8,8,1) and returned.shape unchanged) as well as with
colormap='red-blue' (asserting (8,8,3)); modify assertions and the call sites to
match each branch accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: facd64d5-ebf6-4c17-9ee9-4f10b6f25f25
📒 Files selected for processing (8)
docs/dev/module_layout.rstdocs/dev/pipeline.rstdocs/module.rstsrc/picmaker/cli.pysrc/picmaker/io.pysrc/picmaker/pipeline.pytests/test_cli_helpers.pytests/test_pipeline_helpers.py
- Make `_collect_option_dicts`' `replace` / `proceed` keyword-only so the helper respects the ≤3-positional-parameters rule. - Promote the `_basic_option_dict()` factory duplicated in `test_pipeline_branches.py` and `test_cli_helpers.py` to a `basic_option_dict` fixture in `conftest.py`; both files now consume it via the fixture parameter. - Tighten cross-references in the `_process_one_image` docstring (``images_to_pics`` → `:func:`~picmaker.pipeline.images_to_pics``) and drop the stale `picmaker.py:543` / `picmaker.py:548` line-number references in `_collect_option_dicts`. - Replace `if not <list>:` with explicit `if len(<list>) == 0:` in `pipeline.images_to_pics`, `cli._process_directory`, and `cli._collect_option_dicts` per the falsy-check rule. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Collapse the three `test_pds3_resolve_pointer_obj_*` tests into one `@pytest.mark.parametrize`-driven test over (obj, expected_tails); keep the no-DETECTOR_ID inst_id check as its own focused test. - Collapse the three WFPC2 mosaic tests into one parametrized test over (imagefile, expected_quadrant_fills) using a shared quadrant slice map. - Collapse the two ACS panel mosaic tests into one parametrized test over (imagefile, top_fill, bottom_fill). - In `_process_directory` docstring, qualify `:func:\`process_images\`` → `:func:\`~picmaker.pipeline.process_images\`` so the ref survives Sphinx's nitpicky mode and stays consistent with the rest of the module's docstring style. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All eight findings were verified valid against current code; each fix
is minimal and preserves existing behaviour.
1. cli._collect_option_dicts: tokenise versions-file lines with
shlex.split (was str.split); quoted values with embedded whitespace
now round-trip through argparse the same way they do on the shell.
Regression test added.
2. cli._process_directory non-recursive branch: filter os.listdir
output to actual files via os.path.isfile so a subdirectory whose
name matches `pattern` no longer flows into process_images as if it
were a file. (The recursive branch already gets this for free from
os.walk.) Regression test added.
3. pipeline._pds3_resolve_pointer: reorder the obj-bounds check so the
informative custom IndexError ('index N for PDS pointer X out of
range') fires before Python's bare 'list index out of range'.
Existing test's `match='out of range'` was masking the dead code;
tightened to match the pointer name.
4. pipeline._band_to_rgb: extract the slice → optional zebra fill →
get_limits → apply_colormap chain into a shared helper. Both
_hst_mosaic_rgb (per-detector loop) and _process_one_image
(single-band branch) call it; 30+ duplicate lines eliminated.
5. test_cli_helpers: replace existence-only `rglob(...)` checks with
exact-path equality on `sorted(rglob(...))` so the assertions
verify the produced filenames, not just presence.
6. test_cli_helpers test_process_directory_no_match_is_noop: replace
the combined `assert not X or not Y` with an explicit `if X:` guard
so each assertion tests one condition.
7. test_pipeline_helpers: add brief justifications to the existing
`# type: ignore[arg-type]` lines per the project's rule that
line-level ignores must explain themselves.
8. test_pipeline_helpers test_hst_mosaic_rgb_wfpc2_produces_2x2_mosaic:
the docstring described both colored and grayscale branches but
the test only exercised the colored one. Parametrize over
('red-blue', 3) and (None, 1) so both branches are now covered.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Fixes #12. Splits the two largest functions in the repo into focused helpers so each phase of the pipeline has a clear name, a clear contract, and direct unit-test coverage.
src/picmaker/pipeline.py:_pds3_resolve_pointer— the PDS3 detached-label^pointerresolution that produces(imagefile, filter_info)._hst_mosaic_rgb— the per-detector slice → colormap → mosaic flow for HST ACS/WFC and WFPC2, with_hst_wfpc2_mosaicand_hst_acs_panel_mosaicas geometry sub-helpers._process_one_image— the per-fileget_outfile→ read → slice → colormap → orient → resize → write chain.images_to_picsbody collapses to a flat loop that builds aPicmakerOptions, backfills the legacyNone-means-default kwargs, and delegates each filename to_process_one_image.src/picmaker/cli.py:_collect_option_dicts— the--versions FILEre-parse loop that produces one normalizedoption_dictper non-blank line (main CLI's--replace/--proceedalways win)._process_directory— the per-directory walk, recursive or not.maincollapses to the thin top-level orchestrator that handlesKeyboardInterrupt/SystemExit/Exception.Acceptance criteria from #12:
images_to_picsbody is < 100 lines (now ~84).cli.mainbody is < 100 lines (now ~67).pipeline.py/cli.py(theRUF005/A002/N806ignores the issue mentioned had already been removed); no new per-file ignores added.Documentation updates:
docs/dev/pipeline.rst— flowchart and prose updated to name the new helpers.docs/dev/module_layout.rst— module summary updated.docs/module.rst— new private helpers added to the Sphinx:private-members:list so they render in the API reference.Test plan
pytest— 504 passed, coverage 92.94%ruff check src/ tests/— cleanmypy src/picmaker/— clean (one newassert options.extension is not Noneto threadstr | Nonethroughget_outfile)bash scripts/run-all-checks.sh -c— all code checks passbash scripts/run-all-checks.sh -d— Sphinx build clean, PyMarkdown cleanpyroma— 10/10🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Tests