fix(ml): create null detection markers only after real saves succeed by mihow · Pull Request #1312 · RolnickLab/antenna

mihow · 2026-05-20T00:41:59Z

Summary

Null detections (empty-bbox sentinels marking "image processed, nothing found") were being created before the downstream save steps inside save_results. Two consequences:

Preemptive processed marker. If any downstream step raised, the image was already flagged as processed via the null marker. filter_processed_images would then skip it on retry, leaving the image permanently stuck as "processed, zero detections." Observed on a project where ~400 captures had only null detections and no real ones.
Phantom occurrences. create_and_update_occurrences_for_detections iterated every detection including nulls, so each null marker spawned an Occurrence with determination=NULL. Those leaked through OccurrenceQuerySet.valid() (which only excluded occurrences with zero detections, not occurrences whose only detection is a null).

Reviewer heads-up — silent semantic change to `OccurrenceQuerySet.valid()`

Commit 5 changes the meaning of Occurrence.objects.valid() from "has any detection" to "has at least one Detection.objects.valid() row AND determination is not null." Three call sites pick this up without any line change at the call site:

ami/main/api/views.py:1221 — OccurrenceViewSet.get_queryset. Intended target of the fix. Phantom occurrences stop appearing in the list endpoint.
ami/main/api/views.py:1886 — project summary stats occurrences_count. Will silently decrease on any deployment that has accumulated phantoms (e.g. project 9: 112; project 171: ~400). No-op on clean deployments.
ami/exports/format_types.py:52,221 — DwC-A export. Null-determination occurrences will be excluded from exports. Probably correct (DwC requires taxonID) but not validated against an actual export run in this PR.

If any of those three are load-bearing in a way I'm missing, flag it.

Changes (7 commits)

test(ml) — RED test for broker-outage path: asserts null marker is never persisted if create_detection_images.delay raises and filter_processed_images re-yields the image.
fix(ml) — move null persistence to the absolute final step in save_results. Null markers now run after source_image.save() loop, create_detection_images.delay(), update_calculated_fields_for_events, and Deployment.update_calculated_fields(save=True). Closes the silent-bug window the prior reorder left open (Copilot review caught this on the original PR).
refactor(main) — null-marker abstraction on Detection.
- Detection.NULL_BBOX = None — canonical sentinel value for new writes.
- Detection.is_null_marker property — recognises both bbox=None and legacy bbox=[].
- Detection.build_null_marker(source_image, detection_algorithm) classmethod — single construction point.
- DetectionQuerySet.valid() — consumer default (excludes null markers; named to grow with soft-delete / missing-algorithm / missing-classifications predicates over time).
- DetectionQuerySet.null_markers() — narrow, for "has this image been processed?" checks. Renamed from .null_detections().
refactor(main) — sweep 9 inline NULL_DETECTIONS_FILTER call sites to the new manager methods. Migrates ami/main/models.py, ami/main/api/views.py, ami/ml/models/pipeline.py. Fixes the drifted inline at SourceImageCollectionQuerySet.with_source_images_with_detections_count via a new null_detections_q(prefix) helper for relation-prefixed Q expressions.
fix(main) — tighten OccurrenceQuerySet.valid() to require at least one Detection.objects.valid() row AND a non-null determination. Closes the phantom-Occurrence leak from prod data — no code changes needed in OccurrenceViewSet or format_types.py because they already call .valid(). See "Reviewer heads-up" above for the consumers that pick up the new semantic.
feat(main) — cleanup_null_only_occurrences management command for per-project cleanup of the field bug. Dry-run default. Deletes phantom occurrences (no valid detections OR null determination) and orphan null-marker Detection rows on source images with no real detections. Idempotent.
docs(planning) — PR follow-up plan + prior session notes in docs/claude/.

Test plan

test_null_marker_not_persisted_when_broker_dispatch_fails — RED, then GREEN after move-to-end.
TestDetectionNullMarker (5 tests) — is_null_marker for None / [] / real bbox, build_null_marker field setup, valid() / null_markers() disjointness.
TestOccurrenceValidQuerySet — fixture with real / null-only-detection / null-determination occurrences, asserts valid() returns only real.
TestCleanupNullOnlyOccurrencesCommand (3 tests) — dry-run reports without deleting, --commit deletes phantoms but preserves valid rows and null markers on processed-with-real-detection images, idempotent on second run.
ami.ml.tests + ami.main.TestDetectionNullMarker + ami.jobs.tests: 176/176 pass.
ami.main.tests + ami.ml.tests + ami.jobs.tests + ami.exports.tests: 385/389 pass. The 4 failures in TestRolePermissions are pre-existing and reproduce on the prior PR head 4e33f96; unrelated to this PR.

Manual e2e (dev box)

Happy path async_api job — project 9 / collection 38 / pipeline quebec_vermont_moths_2023. 28s, 10 imgs, 68 dets, 121 classifs, 0 failed. No new phantoms.
Broker-outage simulation — patched create_detection_images.delay to raise mid-job. 0 null markers persisted, 0 phantoms; image stays in filter_processed_images yield list.
Calc-field DB error — patched update_calculated_fields_for_events to raise. Same result: 0 null markers persisted.

Cleanup command — dry-run on dev-deployment project 9 reported 112 phantom occurrences + 112 orphan null markers. Ran --commit:

counter	before	after
total `Occurrence` rows	6 029	5 917
`Occurrence.objects.valid()`	5 917	5 917
total `Detection` rows	6 029	5 917
`Detection.objects.valid()`	5 917	5 917
`Detection.objects.null_markers()`	112	0

Second dry-run reported 0 / 0 (idempotent). The originally-affected deployment is a post-merge ops step.

Out of scope — deferred follow-up

transaction.atomic() wrap. The 5-step persistence block (real detections → classifications → occurrences → calc-fields → null marker) is still composed of unrelated DB writes that can partially commit if one mid-block step raises mid-loop. PR-A here closes the ordering window (null marker writing before downstream steps); it does not close the within-block partial-commit window. A narrow transaction.atomic() wrap with transaction.on_commit for celery dispatch is the structural fix and is deferred to a separate PR because tx changes carry concurrency risk (see PR #1261 scar in ami/jobs/tasks.py:561-571: select_for_update + ATOMIC_REQUESTS contention) and need their own multi-worker e2e + clean revert path.

Dual-form bbox=None vs bbox=[]. New writes go through Detection.NULL_BBOX = None. Legacy rows from earlier runs still carry bbox=[]. .null_markers() / .is_null_marker / null_detections_q() all recognise both, so no consumer breaks — but the dual form is permanent until a data migration backfills legacy rows. Worth a follow-up ticket; not urgent.

Re-classification gap. Unrelated but adjacent: filter_processed_images notes "we don't yet have a mechanism to reclassify detections" — current behavior reprocesses from scratch. Worth a separate ticket.

Summary by CodeRabbit

Release Notes

New Features
- Added cleanup_null_only_occurrences management command to remove phantom occurrence records and orphan null-detection markers for a specified project (dry-run by default).
Bug Fixes
- Fixed processing pipeline order: null-detection markers are now created only after classification and occurrence steps complete, preventing orphaned records from failed processing.
- Improved detection filtering consistency across API and database queries.
- Images with failed processing steps now properly retry on the next run.

Issue #1310: null detections (empty-bbox sentinels marking "image processed, nothing found") were created before create_detections / create_classifications / create_and_update_occurrences_for_detections ran. Two consequences: 1. If any of those downstream steps failed, the image was already flagged as processed via the null marker — filter_processed_images would skip it on the next run, leaving the image permanently in a "processed but no detections" state. Observed on project 171 (400 captures with only null detections). 2. create_and_update_occurrences_for_detections iterated every detection including nulls, so each null marker spawned a phantom Occurrence with determination=NULL. Fix in ami/ml/models/pipeline.py save_results: - Run create_detections / create_classifications / create_and_update_occurrences on the real DetectionResponses only. - After those succeed, build null DetectionResponses for images that ended up without any detections and persist them via a second create_detections call. - Null responses never enter the classification / occurrence loops, so no phantom Occurrence is created even in the happy path. Tests in ami/ml/tests.py TestPipeline: - test_null_detection_does_not_create_phantom_occurrence: asserts the happy path "pipeline found nothing" creates the null marker but no Occurrence. - test_captures_not_marked_processed_after_failure: asserts that when a downstream step (create_classifications) raises, the image without a real detection is left unmarked and filter_processed_images re-yields it. Co-Authored-By: Claude <noreply@anthropic.com>

netlify · 2026-05-20T00:42:04Z

✅ Deploy Preview for antenna-ssec canceled.

Name	Link
🔨 Latest commit	`dda794f`
🔍 Latest deploy log	https://app.netlify.com/projects/antenna-ssec/deploys/6a0e447437a3e20009d9c380

netlify · 2026-05-20T00:42:07Z

✅ Deploy Preview for antenna-preview canceled.

Name	Link
🔨 Latest commit	`4e33f96`
🔍 Latest deploy log	https://app.netlify.com/projects/antenna-preview/deploys/6a0d035a479bdb000845b3ed

netlify · 2026-05-20T00:42:07Z

✅ Deploy Preview for antenna-preview canceled.

Name	Link
🔨 Latest commit	`dda794f`
🔍 Latest deploy log	https://app.netlify.com/projects/antenna-preview/deploys/6a0e4474d3af6b00083fa328

coderabbitai · 2026-05-20T00:42:11Z

📝 Walkthrough

Walkthrough

Defers creation/persistence of null-marker detections until after real detections/classifications/occurrence persistence; centralizes null-marker representation and query helpers, updates call-sites to use Detection.objects.valid()/null_markers(), adds a cleanup command, and extends tests and docs.

Changes

Fix captures marked as processed with zero detections

Layer / File(s)	Summary
Detection & Occurrence model/query abstractions `ami/main/models.py`	Adds `null_detections_q`, `DetectionQuerySet.valid()` and `.null_markers()`, `Detection.NULL_BBOX`, `is_null_marker`, and `Detection.build_null_marker()`. Updates occurrence validity and several detection-count call-sites to use `.valid()` or the helper predicate.
Management command: cleanup phantom occurrences & orphan null markers `ami/main/management/commands/cleanup_null_only_occurrences.py`	New `cleanup_null_only_occurrences` command: dry-run reports counts; `--commit` deletes phantom `Occurrence` rows and orphan null-marker `Detection` rows inside a transaction.
API view query switching `ami/main/api/views.py`	Replaces `NULL_DETECTIONS_FILTER` exclusions with `Detection.objects.valid()` for prefetched detections and DetectionViewSet queryset.
Pipeline: defer null detection creation `ami/ml/models/pipeline.py`	Removes early appending of null DetectionResponses; `filter_processed_images` now checks `existing_detections.valid()`; finalizes `save_results` to create and persist null detection responses via `create_detections(...)` only after detection/classification/occurrence persistence completes.
Tests: null-marker behavior, pipeline regressions, cleanup command `ami/main/tests.py`, `ami/ml/tests.py`	Adds `TestDetectionNullMarker`, `TestOccurrenceValidQuerySet`, `TestCleanupNullOnlyOccurrencesCommand`, and pipeline regression tests verifying null markers don't create phantom Occurrence rows and are not persisted when downstream persistence or broker dispatch fails.
Docs: planning and session notes `docs/claude/*`	Planning doc and session notes describing the null-marker lifecycle, reorder in save_results, test plan, and operational verification steps.

Sequence Diagram(s)

sequenceDiagram
  participant Pipeline
  participant DB as Database
  participant DetectionSvc as DetectionService
  participant Classifier as ClassificationService
  participant Broker

  Pipeline->>DetectionSvc: run detections -> real DetectionResponses
  DetectionSvc-->>Pipeline: detection responses
  Pipeline->>Classifier: create_classifications(detections)
  Classifier-->>Pipeline: classifications
  Pipeline->>DB: create_detections(detections, classifications, occurrences)
  DB-->>Pipeline: persisted detections/classifications/occurrences
  Pipeline->>Broker: create_detection_images.delay(persisted_ids)
  Broker-->>Pipeline: dispatch ack
  alt no real detections for an image
    Pipeline->>DB: create_detections(null_detection_responses)  %% final step
    DB-->>Pipeline: persisted null markers
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

RolnickLab/antenna#1300: Modifies SourceImageCollectionQuerySet.with_source_images_with_detections_count() and may overlap with predicate/counting logic changes introduced here.

Suggested reviewers

annavik

Poem

🐰 I hopped through pipelines late at night,

where markers once jumped out too soon in fright.
Now detections wait until the last step's done,
so failures leave no phantom rows to shun.
A tidy hop — the database sleeps tight.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely summarizes the main change: null detection markers are now created only after real saves succeed, directly addressing the core bug fix.
Linked Issues check	✅ Passed	The PR fully addresses issue `#1310`'s core objective: null detection markers are deferred to the final step in save_results, ensuring downstream failures prevent marking images processed. Supporting changes (null-marker abstraction, queryset methods, cleanup command) align with the secondary remediation objectives.
Out of Scope Changes check	✅ Passed	All changes are directly related to fixing issue `#1310`. The null-marker abstraction, queryset methods, API updates, and cleanup command are all necessary supporting changes for the core fix. Out-of-scope items are explicitly deferred in the description.
Docstring Coverage	✅ Passed	Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%.
Description check	✅ Passed	The pull request provides a comprehensive description that covers all required template sections including summary, list of changes, related issues, detailed explanation, test plan, deployment notes, and checklist.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/premptive-processed-marker

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Copilot

Pull request overview

This PR fixes a pipeline persistence ordering bug where null (bbox=None) “processed, nothing found” detection markers were created too early, causing images to be skipped on retry after downstream failures and occasionally creating phantom Occurrence rows tied only to null detections.

Changes:

Reorders save_results to persist real detections/classifications/occurrences first, then creates null detection markers in a second pass.
Ensures null detections never enter the classification/occurrence creation paths.
Adds regression tests for “no phantom occurrence on null” and “no processed marker after failure”.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
`ami/ml/models/pipeline.py`	Moves null-marker creation to after real detection/classification/occurrence persistence and saves nulls via a separate `create_detections` call.
`ami/ml/tests.py`	Adds tests covering the phantom-occurrence regression and retry behavior after a simulated downstream failure.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Plan for the takeaway-review follow-up work on PR #1312: move-null-to-end + null-detection abstraction (DetectionQuerySet.valid / .null_markers, Detection.is_null_marker, Detection.build_null_marker) + sweep call sites + tighten OccurrenceQuerySet.valid + cleanup command for project 171. Captures rationale for splitting transaction.atomic into a separate follow-up PR (PR-1261 scar). Co-Authored-By: Claude <noreply@anthropic.com>

Adds test_null_marker_not_persisted_when_broker_dispatch_fails to TestPipeline. Patches create_detection_images.delay to raise, asserts the unmatched image has no null marker persisted and that filter_processed_images yields it for re-processing. Verified RED against current ordering — null persistence still runs before delay, so the assertion fails. Next commit moves null persistence to the absolute final step. Co-Authored-By: Claude <noreply@anthropic.com>

Closes the failure window the previous fix left open: null markers were persisted after real-detection / classification / occurrence saves but BEFORE source_image.save, create_detection_images.delay, update_calculated_fields_for_events, and Deployment.update_calculated_fields. A raise in any of those four steps (broker outage, DB error) still left the image flagged as processed. Null markers now run as the last write in save_results so they only persist when every prior step succeeds. Remaining failure window is the return statement. Makes RED test from prior commit pass. Co-Authored-By: Claude <noreply@anthropic.com>

Introduces a single source of truth for "this detection row is a sentinel that records that an algorithm ran against an image and found nothing": - Detection.NULL_BBOX = None (canonical bbox value for new null markers) - Detection.is_null_marker (recognises both bbox=None and legacy bbox=[]) - Detection.build_null_marker(source_image, detection_algorithm) classmethod - DetectionQuerySet.valid() — consumer default, excludes null markers - DetectionQuerySet.null_markers() — narrow, for "has this image been processed?" checks (renamed from .null_detections()) valid() is named to grow: future predicates to fold in include soft-delete tombstones, detections missing an algorithm reference, and detections missing classifications. Consumers asking "give me detections" should default to .valid(). Adds TestDetectionNullMarker covering: is_null_marker for bbox=None / bbox=[] / real bbox, build_null_marker field setup, and disjointness of .valid() / .null_markers() over a fixture with all three row types. Next commit sweeps existing inline NULL_DETECTIONS_FILTER usage to the new API. Co-Authored-By: Claude <noreply@anthropic.com>

…null_markers() Migrates 7 inline NULL_DETECTIONS_FILTER usages to the new manager methods: - Detection.objects.exclude(NULL_DETECTIONS_FILTER) → .valid() - self.detections.exclude(NULL_DETECTIONS_FILTER) → self.detections.all().valid() - subquery .exclude(NULL_DETECTIONS_FILTER) → .valid() - aggregate filter at SourceImageCollectionQuerySet.with_source_images_with_detections_count was the drifted inline ~Q(...bbox__isnull=True) & ~Q(...bbox=[]); now uses a new null_detections_q(prefix) helper for relation-prefixed Q expressions. Touched: - ami/main/models.py: Deployment.get_detections_count, Event.get_detections_count, SourceImage.create_occurrences_from_detections, _annotate_detections_count_subquery, SourceImageCollectionQuerySet.with_source_images_with_detections_count - ami/main/api/views.py: OccurrenceViewSet prefetch_queryset, DetectionViewSet.queryset - ami/ml/models/pipeline.py: filter_processed_images null-only and unclassified checks NULL_DETECTIONS_FILTER constant is retained at module level. Direct get_or_create_detection lookup keeps bbox__isnull=True (algorithm-scoped, narrower than .null_markers() which also includes legacy bbox=[] from other pipelines); added a comment pointing readers to Detection.NULL_BBOX for the canonical sentinel. 176/176 tests in ami.ml, ami.main.TestDetectionNullMarker, ami.jobs pass. Co-Authored-By: Claude <noreply@anthropic.com>

OccurrenceQuerySet.valid() previously only excluded occurrences with no detections at all. Field bug from Issue #1310 created two new phantom shapes that still leaked to the API: 1. Occurrences whose only detections are null-marker sentinels (no real bounding box backing them). 2. Occurrences with determination__isnull=True. valid() now requires at least one .valid() Detection (real, non-null) AND a non-null determination. Built on top of the new Detection.objects.valid() helper so both layers stay in sync as the predicate grows (soft-delete, missing algo, etc.). Downstream callers updated automatically: ami/exports/format_types.py and OccurrenceViewSet.get_queryset both invoke OccurrenceQuerySet.valid and will now filter out the project-171 phantoms once existing rows are cleaned up (next commit). Adds TestOccurrenceValidQuerySet covering all three exclusion shapes. Co-Authored-By: Claude <noreply@anthropic.com>

One-shot per-project cleanup for the Issue #1310 field bug. Deletes: - Phantom occurrences (no valid detections OR null determination) - Orphan null-marker Detection rows on source images with no real detections After running, the affected source images become eligible for re-processing by filter_processed_images on the next ML run. Dry-run by default; pass --commit to delete. python manage.py cleanup_null_only_occurrences --project 171 # dry-run python manage.py cleanup_null_only_occurrences --project 171 --commit Idempotent — re-running on a cleaned project reports zero candidates and exits without touching the database. Adds TestCleanupNullOnlyOccurrencesCommand covering dry-run, commit, and idempotency. Valid occurrences and null markers on images with at least one real detection are explicitly preserved. Co-Authored-By: Claude <noreply@anthropic.com>

mihow · 2026-05-20T23:19:48Z

Claude says: Pushed takeaway-review follow-up commits on top of the original fix. Summary of what's new since the original push:

Commits added (e5f004a is the new HEAD):

test(ml) — RED test for broker-outage leaving null marker (create_detection_images.delay raises)
fix(ml) — move null persistence to the absolute final step in save_results, after calc-field updates and celery dispatch. Closes the silent-bug window @copilot caught.
refactor(main) — add Detection.NULL_BBOX, Detection.is_null_marker, Detection.build_null_marker, DetectionQuerySet.valid() (consumer default), DetectionQuerySet.null_markers() (narrow processed-check). .valid() is named to absorb future predicates (soft-delete, missing algo).
refactor(main) — sweep all inline NULL_DETECTIONS_FILTER call sites to the new manager methods. Fixes the drifted ~Q(...bbox__isnull=True) & ~Q(...bbox=[]) aggregate filter via a new null_detections_q(prefix) helper.
fix(main) — tighten OccurrenceQuerySet.valid() to require ≥1 valid Detection AND non-null determination. Closes the phantom-Occurrence API leak from the field-bug data.
feat(main) — cleanup_null_only_occurrences management command (dry-run default, idempotent) for cleaning up the existing field-bug rows.

What's NOT in this PR (deferred to follow-up):

transaction.atomic() wrap around the persistence block. The remaining failure surface is "mid-block raise leaves partial state committed." A narrow atomic() with on_commit for celery dispatch is the structural fix, but tx changes carry the PR-fix(jobs): fixes for concurrent ML processing jobs #1261 concurrency scar (select_for_update + ATOMIC_REQUESTS contention) and need their own multi-worker e2e plan. This PR closes ~95% of the failure window by moving null persistence to the very end without taking on that concurrency risk.

Local test status:

176/176 in ami.ml.tests + ami.main.TestDetectionNullMarker + ami.jobs.tests
385/389 in the broader sweep. The 4 TestRolePermissions failures reproduce on the prior PR head 4e33f96 — they're pre-existing, unrelated to this PR.

Manual e2e on a dev deployment is queued next.

mihow · 2026-05-20T23:25:01Z

Claude says: E2E validation on a dev deployment complete. All three paths pass.

Path 1 — Happy path (async_api job)

test_ml_job_e2e --project 9 --collection 38 --pipeline quebec_vermont_moths_2023 --dispatch-mode async_api

Job completed in 28.21s. 10 images, 68 detections, 121 classifications, 0 failed. NATS path through the new ordering works end-to-end.

Path 2 — Broker outage simulation

Live save_results call inside transaction.atomic() with create_detection_images.delay patched to raise RuntimeError. Pipeline: quebec_vermont_moths_2023. Two source images: one with a real detection in the response, one without.

=== broker outage ===
BEFORE:        null_dets=0  phantoms=0
DURING-TX:     null_dets=0  phantoms=0    ← image without real-detection NEVER gets a null marker
AFTER-ROLLBACK:null_dets=0  phantoms=0
DELTA-NULL inside tx: 0

Pre-fix behavior would have shown DURING-TX: null_dets=1 because the null marker was persisted before create_detection_images.delay ran. With null persistence moved to the absolute final step, the marker is never written.

Path 3 — Calc-field DB error

Same shape, with update_calculated_fields_for_events patched to raise instead.

=== calc-field DB error ===
BEFORE:        null_dets=0  phantoms=0
DURING-TX:     null_dets=0  phantoms=0
AFTER-ROLLBACK:null_dets=0  phantoms=0
DELTA-NULL inside tx: 0

Same result — null marker never persisted because the failure runs before null creation in the new order.

Cleanup command dry-run

Ran on dev DB to validate idempotent dry-run behavior (the real project-171 cleanup will run after merge against a prod DB):

$ python manage.py cleanup_null_only_occurrences --project 9
Project #9 (Workshop Test - Michael):
  Phantom occurrences (no valid detection or null determination): 112
  Orphan null-marker detections on images with no real detections: 112
Dry run — pass --commit to delete.

The 112 phantoms match the delta between Project.occurrences.count() and Project.occurrences.valid().count() after this PR's OccurrenceQuerySet.valid() tightening — confirms the new .valid() predicate identifies exactly the rows the cleanup command targets.

Branch is in good shape

Test counts:

14/14 TestDetectionNullMarker / TestOccurrenceValidQuerySet / TestCleanupNullOnlyOccurrencesCommand
176/176 ami.ml.tests + ami.main.TestDetectionNullMarker + ami.jobs.tests
385/389 broader sweep (4 pre-existing TestRolePermissions failures unrelated to this PR)
3 live e2e paths validated on the dev box

Ready for review.

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

ami/main/api/views.py (1)

613-628: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

has_detections still disagrees with with_detections.

This prefetch correctly drops null markers, but filter_by_has_detections() still checks for any Detection row. /captures/?has_detections=true&with_detections=true can therefore return captures whose filtered_detections is empty.

Suggested fix

     def filter_by_has_detections(self, queryset: QuerySet) -> QuerySet:
         has_detections = self.request.query_params.get("has_detections")
         if has_detections is not None:
             has_detections = BooleanField(required=False).clean(has_detections)
             queryset = queryset.annotate(
-                has_detections=models.Exists(Detection.objects.filter(source_image=models.OuterRef("pk"))),
+                has_detections=models.Exists(
+                    Detection.objects.valid().filter(source_image=models.OuterRef("pk"))
+                ),
             ).filter(has_detections=has_detections)
         return queryset

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/api/views.py` around lines 613 - 628, The has_detections check
currently looks for any Detection row while filtered_detections drops
null/non-qualifying rows, causing mismatches; update filter_by_has_detections to
only count detections that meet the same criteria used for the prefetch (i.e.
use the annotated Detection queryset created from Detection.objects.valid() that
includes occurrence_meets_criteria or an Exists/Subquery against
qualifying_occurrence_ids) so the filter requires existence of at least one
Detection with occurrence_meets_criteria=True (or occurrence_id in
qualifying_occurrence_ids and score >= score) rather than any Detection row;
adjust the logic referencing filter_by_has_detections,
Detection.objects.valid(), occurrence_meets_criteria, qualifying_occurrence_ids
and filtered_detections accordingly.

ami/ml/models/pipeline.py (1)

442-459: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Dedup the legacy bbox=[] null marker here too.

This lookup only matches canonical bbox IS NULL rows. If the image already has a legacy empty-list sentinel for the same algorithm, reprocessing will create a second null marker instead of reusing the existing one.

Suggested fix

-        existing_detection = Detection.objects.filter(
-            source_image=source_image,
-            bbox__isnull=True,
-            detection_algorithm=detection_algo,
-        ).first()
+        existing_detection = (
+            Detection.objects.filter(
+                source_image=source_image,
+                detection_algorithm=detection_algo,
+            )
+            .null_markers()
+            .first()
+        )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/ml/models/pipeline.py` around lines 442 - 459, The dedupe query only
checks for bbox__isnull=True and misses legacy empty-list sentinel rows, causing
duplicate null markers; update the existing_detection lookup in the pipeline to
include both canonical NULL and the legacy sentinel by OR-ing bbox__isnull=True
with bbox equal to the legacy sentinel (use Detection.NULL_BBOX or the
empty-list value) — e.g., replace the single filter(...) call that sets
existing_detection with a query using Q(...) or bbox__in=[None,
Detection.NULL_BBOX] while keeping the same source_image and detection_algorithm
(refer to Detection, existing_detection, detection_algo, and
detection_resp.algorithm.key).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ami/main/management/commands/cleanup_null_only_occurrences.py`:
- Around line 76-82: The success message currently uses the delete() return
values (null_deleted, phantom_deleted) which include cascaded rows; replace
those with the pre-calculated counters (null_count and phantom_count) when
writing the final message—inside the same transaction.atomic() block after
orphan_null_markers.delete() and phantom_occs.delete(), call
self.stdout.write(self.style.SUCCESS(f"Deleted {phantom_count} phantom
occurrences and {null_count} orphan null markers.")) so the log reports only the
intended occurrence counts.

In `@ami/main/models.py`:
- Around line 4165-4169: The sampling for detections_only is out of sync: the
annotate uses source_images_with_detections_count with
filter=~null_detections_q("images__detections__") but
SourceImageCollection.sample_detections_only() still uses the simpler
detections__isnull=False and can include null-marker-only images; update
sample_detections_only to apply the same filter logic (i.e. use
null_detections_q("images__detections__") negated or reuse the same annotated
queryset/condition) so the sampled collection matches the
source_images_with_detections_count semantics and only includes images that the
new counter considers as having detections.
- Around line 2837-2844: The null-marker builder currently sets
timestamp=timezone.now() in Detection.build_null_marker which forces the marker
to use processing time; remove that explicit timestamp (or set timestamp=None)
so the Detection instance is created without a timestamp and Detection.save()
can backfill the correct capture timestamp; update the build_null_marker
constructor call that uses cls(NULL_BBOX, source_image, detection_algorithm,
...) accordingly and keep other fields (source_image, bbox=cls.NULL_BBOX,
detection_algorithm) unchanged.

---

Outside diff comments:
In `@ami/main/api/views.py`:
- Around line 613-628: The has_detections check currently looks for any
Detection row while filtered_detections drops null/non-qualifying rows, causing
mismatches; update filter_by_has_detections to only count detections that meet
the same criteria used for the prefetch (i.e. use the annotated Detection
queryset created from Detection.objects.valid() that includes
occurrence_meets_criteria or an Exists/Subquery against
qualifying_occurrence_ids) so the filter requires existence of at least one
Detection with occurrence_meets_criteria=True (or occurrence_id in
qualifying_occurrence_ids and score >= score) rather than any Detection row;
adjust the logic referencing filter_by_has_detections,
Detection.objects.valid(), occurrence_meets_criteria, qualifying_occurrence_ids
and filtered_detections accordingly.

In `@ami/ml/models/pipeline.py`:
- Around line 442-459: The dedupe query only checks for bbox__isnull=True and
misses legacy empty-list sentinel rows, causing duplicate null markers; update
the existing_detection lookup in the pipeline to include both canonical NULL and
the legacy sentinel by OR-ing bbox__isnull=True with bbox equal to the legacy
sentinel (use Detection.NULL_BBOX or the empty-list value) — e.g., replace the
single filter(...) call that sets existing_detection with a query using Q(...)
or bbox__in=[None, Detection.NULL_BBOX] while keeping the same source_image and
detection_algorithm (refer to Detection, existing_detection, detection_algo, and
detection_resp.algorithm.key).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 277e1e60-2244-4c0e-9ce4-f2f61f800478

📥 Commits

Reviewing files that changed from the base of the PR and between 4e33f96 and e5f004a.

📒 Files selected for processing (8)

ami/main/api/views.py
ami/main/management/commands/cleanup_null_only_occurrences.py
ami/main/models.py
ami/main/tests.py
ami/ml/models/pipeline.py
ami/ml/tests.py
docs/claude/planning/pr-1312-null-marker-followup.md
docs/claude/sessions/2026-05-19-pr-1312-premptive-processed-marker.md

✅ Files skipped from review due to trivial changes (2)

docs/claude/sessions/2026-05-19-pr-1312-premptive-processed-marker.md
docs/claude/planning/pr-1312-null-marker-followup.md

Five drift / quick-win fixes from PR #1312 review: 1. ami/main/api/views.py — filter_by_has_detections now uses Detection.objects.valid() so /captures/?has_detections=true agrees with the with_detections prefetch (which already drops null markers). Without this, has_detections=true could return captures whose filtered_detections array was empty. 2. ami/main/models.py — SourceImageCollection.sample_detections_only now samples by Detection.objects.valid() instead of detections__isnull=False, matching the tightened source_images_with_detections_count annotation. A detections_only collection no longer admits images that have only null markers. 3. ami/main/models.py — Detection.build_null_marker drops timestamp=timezone.now(). Detection.save() backfills timestamp from the source image's capture time, so the marker sorts/filters by capture time rather than processing time. Test asserts timestamp is None on the builder output. 4. ami/ml/models/pipeline.py — get_or_create_detection null-marker dedup now goes through .null_markers() so legacy bbox=[] sentinels from older runs are re-used. The lookup is still detection_algorithm-scoped, so the wider .null_markers() predicate stays narrow at the call site (no false matches across algorithms). 5. ami/main/management/commands/cleanup_null_only_occurrences.py — success message now reports the pre-calculated phantom_count / null_count instead of the .delete() return tuple, which includes cascade-deleted rows and would mislead the operator about what the command targeted. 68/68 in ami.main null-marker tests + ami.ml.tests pass. Co-Authored-By: Claude <noreply@anthropic.com>

mihow · 2026-05-20T23:33:16Z

Claude says: Pushed CodeRabbit feedback round in dda794f8. Five fixes applied:

Inline-thread items (3) — resolved on each thread:

models.py:2843 build_null_marker — dropped timestamp=timezone.now() so Detection.save() backfills from capture time
models.py:4168 sample_detections_only — now samples via Detection.objects.valid().values("source_image_id") to align with the tightened count semantics
cleanup_null_only_occurrences.py:82 — success message uses pre-calculated counts, with a comment explaining .delete() cascade-count inflation

Outside-diff items (2) — applied in same commit:

views.py:613-628 filter_by_has_detections — now uses Detection.objects.valid() so /captures/?has_detections=true agrees with the with_detections prefetch that already excludes null markers. Without this fix, a capture with only null markers would surface as "has detections" but its filtered_detections array would be empty.
pipeline.py:442-459 null-marker dedup in get_or_create_detection — switched to .null_markers() so legacy bbox=[] sentinels from older runs are reused. The lookup is still detection_algorithm-scoped, so widening bbox to OR [] doesn't cross algorithm boundaries.

68/68 in ami.main null-marker tests + ami.ml.tests pass post-fix. CI green pre-fix; re-running now.

mihow · 2026-05-21T00:17:23Z

Claude says: Cleanup command validated on a dev deployment (project 9) — 112 phantom occurrences + 112 orphan null markers deleted, post-cleanup counts match valid() queryset, second dry-run reports 0/0 (idempotent). Real-data confirmation now in the PR body's e2e section. reviewDecision is still REVIEW_REQUIRED — pinging for human eyes.

Copilot AI review requested due to automatic review settings May 20, 2026 00:42

Copilot started reviewing on behalf of mihow May 20, 2026 00:42 View session

Copilot AI reviewed May 20, 2026

View reviewed changes

Comment thread ami/ml/models/pipeline.py Outdated

mihow mentioned this pull request May 20, 2026

docs: dev-box live testing pattern + reprocess flag interaction [abandoned] #1314

Closed

mihow and others added 7 commits May 20, 2026 15:45

Copilot started work on behalf of mihow May 20, 2026 23:19 View session

Copilot finished work on behalf of mihow May 20, 2026 23:20

coderabbitai Bot reviewed May 20, 2026

View reviewed changes

Comment thread ami/main/management/commands/cleanup_null_only_occurrences.py Outdated

Comment thread ami/main/models.py

Comment thread ami/main/models.py

Conversation

mihow commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Reviewer heads-up — silent semantic change to OccurrenceQuerySet.valid()

Changes (7 commits)

Test plan

Manual e2e (dev box)

Out of scope — deferred follow-up

Summary by CodeRabbit

Release Notes

Uh oh!

netlify Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for antenna-ssec canceled.

Uh oh!

netlify Bot commented May 20, 2026

✅ Deploy Preview for antenna-preview canceled.

Uh oh!

netlify Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for antenna-preview canceled.

Uh oh!

coderabbitai Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

mihow commented May 20, 2026

Uh oh!

mihow commented May 20, 2026

Path 1 — Happy path (async_api job)

Path 2 — Broker outage simulation

Path 3 — Calc-field DB error

Cleanup command dry-run

Branch is in good shape

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mihow commented May 20, 2026

Uh oh!

mihow commented May 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mihow commented May 20, 2026 •

edited

Loading

Reviewer heads-up — silent semantic change to `OccurrenceQuerySet.valid()`

netlify Bot commented May 20, 2026 •

edited

Loading

netlify Bot commented May 20, 2026 •

edited

Loading

coderabbitai Bot commented May 20, 2026 •

edited

Loading