feat(taxa): per-taxon verification & agreement counts + verified filter (#1316) by mihow · Pull Request #1317 · RolnickLab/antenna

mihow · 2026-05-22T04:46:45Z

Implements #1316 — per-taxon verification / model-agreement data on the taxa list + detail, with matching UI.

Backend (`GET /api/v2/taxa/`)

verified_count and agreed_with_prediction_count — always-on annotations. Both roll up descendant occurrences (a Family/Order row aggregates its species), occurrence-weighted.
agreed_exact_count — gated behind with_agreement=true; always present on the detail view. Compares occurrence.determination_id (top human ID, already maintained) against the top machine Classification.taxon_id.
verified=true|false filter — EXISTS / strict complement, project-scoped, respects apply_default_filters.
verified_count added to ordering_fields.
Detail view (/taxa/<id>/) returns all four counts unconditionally.

The hierarchical descendant match reuses the parents_json containment pattern from the occurrence-list taxon=<id> filter, but the right-hand side is built from an OuterRef via jsonb_build_array(jsonb_build_object('id', <outer id>)) — a literal __contains lookup can't embed an OuterRef. Migration 0085 adds the supporting GIN index (parents_json jsonb_path_ops), created CONCURRENTLY / IF NOT EXISTS so it co-exists with the same index if it lands separately via the #1307 follow-up.

Frontend

Sortable Verified column on the taxa list (links to that taxon's verified occurrences), shown next to Occurrences.
Verification status filter pill (All / Verified / Unverified) → verified= query param (reuses the existing occurrence-list filter component).
Verification panel on the taxon detail page showing the counts.

Tests

ami.main.tests.TestTaxaVerification (9 tests): hierarchical rollup to ancestors, chosen-identification-only agreed_with_prediction, gated agreed_exact_count (absent on list unless with_agreement), verified filter + strict complement, ordering, and apply_defaults handling. Existing taxon suite (47 tests) green, including the list query-count audit (the new annotations are inline correlated subqueries, so the statement count doesn't change).

Performance — needs a decision before merge

Rough warm/cold timings for a 25-row page against a production DB copy, with the GIN index in place (numbers are measurements, the interpretation is mine):

Project size	default list	`verified=false`	`ordering=verified_count`	`with_agreement`
small (~25 occ)	~15ms	~6ms	~5ms	~40ms
mid (~2k occ)	~380ms	~430ms	~500ms	~630ms
large (~180k occ, ~11k verified)	~8.5s cold	~9s cold	>90s (timeout)	not measured

Small/mid meet the issue's targets. The largest project does not: the always-on count subqueries run ~8–9s cold, and ordering=verified_count falls off a cliff because it must compute the subquery for every taxon in the project before sorting, not just a page.

This looks structural rather than a missing index. Directions worth discussing (ordered by effort):

Gate the count annotations (compute only when explicitly requested / sorted on) and keep the default list on the cheap occurrences_count path.
Drop server-side sort on verified_count for large projects, or back it with a denormalized column (the issue already lists "backfill counts onto a denormalized Taxon field" as a follow-up).
Keep agreed_exact_count detail-only (already gated) and, if needed, add a /taxa/stats/verification/ aggregate endpoint (the Endpoint for stats about verified occurrences #1307 fallback the issue calls out).

I'd rather settle this than silently ship a list endpoint that times out on the biggest project. The correctness and UX are done; this is the open question.

Note on count semantics

occurrences_count stays direct-match (unchanged) while verified_count rolls up descendants, per the ticket. For species rows (the common case) they're equivalent; for Family/Order rows verified_count can exceed the direct occurrences_count. Flagging in case we want them consistent.

Test plan

Backend tests pass in CI.
Decide on the large-project performance approach above.
Browser check of the column / filter / detail panel (not yet done — the shared dev stack is bind-mounted to another branch's live test).

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

New Features
- Added verification metrics to species data, including verified occurrence counts and agreement statistics.
- Added sortable "Verified" column to the species table for quick visibility into verification status.
- Added "Verification" section to species detail view displaying verified counts, predicted agreements, and exact model matches.
- Added "Verification status" filter to species page, enabling filtering by verified or unverified species.

…lter Adds to GET /api/v2/taxa/ (issue #1316): - verified_count and agreed_with_prediction_count annotations (always on), rolled up over descendant occurrences via a hierarchical parents_json match. - agreed_exact_count, gated behind with_agreement=true (and always on the detail view). - verified=true|false filter (EXISTS / strict complement), project-scoped and respecting apply_default_filters. - verified_count added to ordering_fields. The hierarchical descendant match uses a Postgres jsonb @> containment built from an OuterRef (literal __contains can't embed an OuterRef). Migration 0085 adds the supporting GIN index on Taxon.parents_json (jsonb_path_ops). Frontend: sortable "Verified" column + "Verification status" filter on the taxa list, and a Verification panel on the taxon detail page. Co-Authored-By: Claude <noreply@anthropic.com>

netlify · 2026-05-22T04:46:49Z

✅ Deploy Preview for antenna-preview canceled.

Name	Link
🔨 Latest commit	`16b1468`
🔍 Latest deploy log	https://app.netlify.com/projects/antenna-preview/deploys/6a10b973b8aa0100082c9e6f

netlify · 2026-05-22T04:46:49Z

✅ Deploy Preview for antenna-ssec canceled.

Name	Link
🔨 Latest commit	`16b1468`
🔍 Latest deploy log	https://app.netlify.com/projects/antenna-ssec/deploys/6a10b973b71bc70007e9f915

coderabbitai · 2026-05-22T04:46:52Z

📝 Walkthrough

Walkthrough

This PR implements end-to-end per-taxon verification and agreement metrics, including a PostgreSQL GIN index for performance, ORM annotations for count computation, API serialization with conditional gating, and corresponding frontend table column, filter control, and detail view display.

Changes

Per-Taxon Verification and Agreement Metrics

Layer / File(s)	Summary
Database index and model interface `ami/main/migrations/0085_taxon_parents_json_gin_index.py`, `ami/main/models.py`	GIN index on `Taxon.parents_json` enables efficient hierarchical queries; three Taxon accessor methods define the annotation contract for `verified_count`, `agreed_with_prediction_count`, and `agreed_exact_count`.
ORM annotation and filtering `ami/main/api/views.py`	`add_verification_data()` computes per-taxon rollups for verified occurrences and agreement metrics (best identification vs. top machine prediction), conditionally adds `agreed_exact_count` annotation, and applies `verified=true/false` filtering for list requests.
API serialization and field gating `ami/main/api/serializers.py`	Both `TaxonListSerializer` and `TaxonSerializer` expose the three count fields; `TaxonListSerializer` conditionally removes `agreed_exact_count` unless `with_agreement=true` is passed as a query parameter.
Frontend data model and routing `ui/src/data-services/models/species.ts`, `ui/src/utils/getAppRoute.ts`	`Species` model exposes three count getters (`numVerified`, `numAgreedWithPrediction`, `numAgreedExact`); filter type system extended to accept `'verified'` key.
Frontend list view with column and filter `ui/src/pages/species/species-columns.tsx`, `ui/src/pages/species/species.tsx`	New sortable "Verified" column displays `numVerified` with link to filtered occurrences; enabled by default and paired with sidebar filter control for `verified` status.
Frontend detail page verification panel `ui/src/pages/species-details/species-details.tsx`	New verification section in details "Fields" tab shows verified count with link, agreed-with-prediction count, and conditional exact-match count.
Comprehensive test coverage `ami/main/tests.py`	`TestTaxaVerification` suite validates hierarchical count rollups, list-vs-detail field gating, filter correctness, ordering support, and interaction with `apply_defaults=false`.
Feature specification `docs/claude/planning/2026-05-20-taxa-verification-guidance-ticket.md`	Planning ticket documenting requirements, annotation semantics, API contracts for list and detail responses, database performance prerequisites, frontend UI changes, and test strategy.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Add guidance and stats about which taxa need verification #1316: Implements the same per-taxon verification and agreement annotation backend changes, API field exposure, GIN index, serialization with gating, and frontend UI components for the taxa endpoint.

Suggested reviewers

annavik

Poem

🐰 Counts hop through the schema, annotated with care,
GIN indexes dance as the queries go spare.
Verified or not, each taxon now gleams,
With agreement and prediction fulfilling our dreams! 🌿✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 22.22% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically describes the main changes: adding per-taxon verification and agreement counts plus a verified filter to the taxa endpoint.
Description check	✅ Passed	The description covers the template sections: Summary, List of Changes, Related Issues (via `#1316`), Detailed Description with backend/frontend/test details, and a Checklist. However, the Deployment Notes section is missing and several checklist items are unchecked.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch worktree-taxa-verification-counts

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Copilot

Pull request overview

Adds per-taxon verification and model-agreement metrics to the taxa API and surfaces them in the taxa (“species”) UI, including a new verified filter and verified-count sorting/links.

Changes:

Backend: annotate taxa with verified_count + agreed_with_prediction_count, optionally agreed_exact_count, and add verified=true|false filtering; add GIN index migration for parents_json.
Frontend: add “Verified” column, verified filter control, and taxon detail “Verification” panel; propagate verified query param via routing utils.
Tests/docs: add API tests covering counts, gating, filter complement behavior, ordering, and apply-defaults behavior; add planning doc.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
ui/src/utils/getAppRoute.ts	Adds `verified` to supported query filter keys for route generation.
ui/src/pages/species/species.tsx	Enables verified filter control and default column visibility.
ui/src/pages/species/species-columns.tsx	Adds sortable “Verified” column linking to verified occurrences for a taxon.
ui/src/pages/species-details/species-details.tsx	Displays verification/agreement counts on the taxon detail panel with a link to verified occurrences.
ui/src/data-services/models/species.ts	Exposes new server fields via getters (`verified_count`, agreement counts).
docs/claude/planning/2026-05-20-taxa-verification-guidance-ticket.md	Planning/spec documentation for the feature and performance considerations.
ami/main/tests.py	Adds `TestTaxaVerification` coverage for annotations, gating, filtering, ordering, and apply-defaults.
ami/main/models.py	Adds Taxon stub methods for new annotated fields to support serialization.
ami/main/migrations/0085_taxon_parents_json_gin_index.py	Adds concurrent GIN index on `Taxon.parents_json` for hierarchical containment performance.
ami/main/api/views.py	Implements hierarchical “under taxon” occurrence correlation, annotations, `verified` filter, and ordering field addition.
ami/main/api/serializers.py	Gates `agreed_exact_count` field on list responses unless `with_agreement=true`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

        if not include_unobserved:
            # Efficient EXISTS check that uses the composite index
            qs = qs.filter(models.Exists(Occurrence.objects.filter(base_filter)))

+        qs = self.add_verification_data(qs, occurrence_filters, default_filters_q)
+


        "created_at",
        "updated_at",
        "occurrences_count",
+        "verified_count",


…bquery The per-taxon correlated parents_json subquery for verified_count / agreed_with_prediction_count / agreed_exact_count did not scale: on a large project (~1k taxa, ~17k occurrences) the taxa list timed out at the 30s statement limit even with the column hidden and on the default sort, because each page row (and the verified=false COUNT) ran a JSONB containment scan the GIN index can't serve when the @> right-hand side is an OuterRef. All three counts only concern verified occurrences (those with a non-withdrawn Identification), which are sparse. Compute the hierarchical rollup in a single pass over that small set in Python and apply it as constant-time CASE annotations; resolve the verified filter from the same precomputed set via id__in. Page values, sort, and the pagination COUNT are now constant-time. Also fixes ancestor rollup returning 0: parents_json round-trips through the pydantic schema field, so elements may be TaxonParent objects rather than dicts. Measured on the large project: default page 30s timeout -> ~0.6s; verified=false 30s timeout -> ~0.2s; ordering=verified_count 30s timeout -> ~0.04s. Co-Authored-By: Claude <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

ami/main/api/views.py (1)
1429-1433: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Gate verified_count ordering to project-scoped requests (or add a default annotation).

TaxonViewSet.ordering_fields advertises verified_count, but when project_id is not provided TaxonViewSet.get_queryset() does not call add_verification_data() and does not annotate verified_count (only occurrences_count/events_count are set to None). With NullsLastOrderingFilter, ordering by verified_count on the non-project path will attempt to order by a non-existent DB value, causing the queryset/ordering to fail. Restrict verified_count from ordering_fields when no active project is present, or annotate a default verified_count in that branch.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/api/views.py` around lines 1429 - 1433, TaxonViewSet advertises
ordering by "verified_count" but get_queryset() only annotates verified_count
via add_verification_data() when a project is provided, causing ordering to fail
on non-project requests; fix by either gating TaxonViewSet.ordering_fields to
remove "verified_count" when no project is active or by adding a default
annotation for verified_count in the non-project branch of
TaxonViewSet.get_queryset() (e.g., annotate verified_count=Value(0) or similar)
so NullsLastOrderingFilter always has a column to order by; update references to
add_verification_data(), TaxonViewSet.ordering_fields, and any filters using
NullsLastOrderingFilter accordingly.

🧹 Nitpick comments (5)

docs/claude/planning/2026-05-20-taxa-verification-guidance-ticket.md (5)
46-46: ⚡ Quick win

Clarify which four fields are included in detail view.

The text states "all four fields above" but only three new annotations are defined in this section: verified_count, agreed_with_prediction_count, and agreed_exact_count. The fourth field is ambiguous. If you're counting occurrences_count, that field pre-existed this feature and shouldn't be listed as part of "above."
✏️ Suggested clarification
-`GET /api/v2/taxa/<id>/` should include all four fields above unconditionally — single-row cost is negligible.
+`GET /api/v2/taxa/<id>/` should include all three verification fields above unconditionally (`verified_count`, `agreed_with_prediction_count`, `agreed_exact_count`) — single-row cost is negligible.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/claude/planning/2026-05-20-taxa-verification-guidance-ticket.md` at line
46, The doc currently says "all four fields above" but only three new
annotations are listed; update the GET /api/v2/taxa/<id>/ detail description to
explicitly enumerate the four fields returned: verified_count,
agreed_with_prediction_count, agreed_exact_count, and occurrences_count (noting
that occurrences_count is pre-existing), or remove "above" and state that these
four fields will be included unconditionally in the detail view to eliminate
ambiguity.
162-165: 💤 Low value

Consider removing specific line numbers from references.

Like the earlier reference on line 38, these hardcoded line numbers will become stale as the codebase evolves. For a planning document that may be referenced later, consider referencing by module and function/class name only, or note that line numbers are approximate as of a specific date.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/claude/planning/2026-05-20-taxa-verification-guidance-ticket.md` around
lines 162 - 165, The document currently pins specific line numbers for locations
like ami/main/api/views.py:1403 (TaxonViewSet), ami/main/api/views.py:1576
(get_taxa_observed), ami/main/models.py:2440 (Identification model /
agreed_with_prediction FK), and ami/main/models.py:3383
(update_occurrence_determination), which will go stale; remove the numeric line
references and instead cite only the module and symbol names (e.g.,
ami.main.api.views — TaxonViewSet; ami.main.api.views — get_taxa_observed;
ami.main.models — Identification; ami.main.models —
update_occurrence_determination) or add a short parenthetical like “line numbers
approximate as of 2026-05-20” if you want to keep a temporal anchor.
38-38: ⚡ Quick win

Avoid hardcoded line numbers in planning documentation.

The reference to ami/main/models.py:2528, 3383-3393 will become stale as the codebase evolves. Consider referencing the function name or module-level documentation instead, or note that line numbers are approximate as of the PR date.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/claude/planning/2026-05-20-taxa-verification-guidance-ticket.md` at line
38, Replace the hardcoded file line references with stable identifiers: remove
"ami/main/models.py:2528, 3383-3393" and instead reference the function and
class names (e.g. the update_occurrence_determination function and
Identification.save method in ami/main/models.py) or state that the line numbers
are approximate as of the PR date; this keeps the doc resilient to code reflows
while still directing readers to the correct implementation.
25-36: ⚡ Quick win

Update implementation approach description to match actual implementation.

The document describes the implementation using "EXISTS subquery" (line 25) and "correlated subqueries per row" (lines 31, 36), but the PR objectives indicate the final implementation switched to a "single-pass computation: precompute hierarchical rollup over verified occurrences in Python, apply results as CASE annotations and id__in for the filter" to resolve performance issues. This planning document should be updated to reflect the architecture that was actually shipped.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/claude/planning/2026-05-20-taxa-verification-guidance-ticket.md` around
lines 25 - 36, Update the text for TaxonViewSet's `verified=true|false` and the
three annotations (`verified_count`, `agreed_with_prediction_count`,
`agreed_exact_count`) to reflect the actual shipped architecture: replace
references to "EXISTS subquery" and "correlated subqueries per row" with a
description that we perform a single-pass computation that precomputes a
hierarchical rollup of verified occurrences in Python and then applies those
results back to the queryset as CASE annotations (for counts) and an id__in
filter (for `verified=true`), and note that `agreed_exact_count` is computed
only when `with_agreement=true` and therefore gated for performance; keep the
same behavioral semantics (descendant-inclusive via parents_json__contains or
determination_id) and preserve the existing notes about which fields are
always-on vs gated.
48-60: ⚡ Quick win

Resolve inconsistency about GIN index status.

Lines 50 mentions the GIN index is "already flagged as a follow-up to #1307" but line 57 states "Treat shipping the GIN index as a hard blocker." These statements conflict. Based on the PR objectives, the migration 0085 does include the GIN index, suggesting it was indeed treated as a blocker. Clarify the dependency status or update the language to reflect that the blocker was resolved.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/claude/planning/2026-05-20-taxa-verification-guidance-ticket.md` around
lines 48 - 60, The text is inconsistent about whether the GIN index on
Taxon.parents_json is still a blocker; update the doc to state the resolved
dependency by referencing the actual migration that applied it (migration 0085)
and the index name main_taxon_parents_json_gin_idx so readers know the GIN index
was shipped, and remove or reword the "hard blocker" assertion (or conversely,
if it truly remains unshipped, change the line that says it was "already
flagged" to indicate it is planned in `#1307`); ensure you mention
Taxon.parents_json and main_taxon_parents_json_gin_idx and either mark the
blocker as resolved (index applied in migration 0085) or clearly state it is
still required before enabling recursive rollup correctness.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ami/main/api/views.py`:
- Around line 1696-1701: The count loop is iterating SQL rows from
verified_occurrences (the Occurrence queryset built with variable
verified_occurrences and annotated with _agreed_prediction_id), but when
occurrence_filters includes detections__source_image__collections a single
Occurrence can join to multiple Detection rows, inflating counts; fix by
de-duplicating Occurrence rows before the Python rollup—call
verified_occurrences = verified_occurrences.distinct('pk') (or
.values_list('pk', flat=True).distinct() and re-query if you need full objects)
immediately after the annotated queryset is built and before the
values()/counting loop that computes verified_count,
agreed_with_prediction_count, and agreed_exact_count.

---

Outside diff comments:
In `@ami/main/api/views.py`:
- Around line 1429-1433: TaxonViewSet advertises ordering by "verified_count"
but get_queryset() only annotates verified_count via add_verification_data()
when a project is provided, causing ordering to fail on non-project requests;
fix by either gating TaxonViewSet.ordering_fields to remove "verified_count"
when no project is active or by adding a default annotation for verified_count
in the non-project branch of TaxonViewSet.get_queryset() (e.g., annotate
verified_count=Value(0) or similar) so NullsLastOrderingFilter always has a
column to order by; update references to add_verification_data(),
TaxonViewSet.ordering_fields, and any filters using NullsLastOrderingFilter
accordingly.

---

Nitpick comments:
In `@docs/claude/planning/2026-05-20-taxa-verification-guidance-ticket.md`:
- Line 46: The doc currently says "all four fields above" but only three new
annotations are listed; update the GET /api/v2/taxa/<id>/ detail description to
explicitly enumerate the four fields returned: verified_count,
agreed_with_prediction_count, agreed_exact_count, and occurrences_count (noting
that occurrences_count is pre-existing), or remove "above" and state that these
four fields will be included unconditionally in the detail view to eliminate
ambiguity.
- Around line 162-165: The document currently pins specific line numbers for
locations like ami/main/api/views.py:1403 (TaxonViewSet),
ami/main/api/views.py:1576 (get_taxa_observed), ami/main/models.py:2440
(Identification model / agreed_with_prediction FK), and ami/main/models.py:3383
(update_occurrence_determination), which will go stale; remove the numeric line
references and instead cite only the module and symbol names (e.g.,
ami.main.api.views — TaxonViewSet; ami.main.api.views — get_taxa_observed;
ami.main.models — Identification; ami.main.models —
update_occurrence_determination) or add a short parenthetical like “line numbers
approximate as of 2026-05-20” if you want to keep a temporal anchor.
- Line 38: Replace the hardcoded file line references with stable identifiers:
remove "ami/main/models.py:2528, 3383-3393" and instead reference the function
and class names (e.g. the update_occurrence_determination function and
Identification.save method in ami/main/models.py) or state that the line numbers
are approximate as of the PR date; this keeps the doc resilient to code reflows
while still directing readers to the correct implementation.
- Around line 25-36: Update the text for TaxonViewSet's `verified=true|false`
and the three annotations (`verified_count`, `agreed_with_prediction_count`,
`agreed_exact_count`) to reflect the actual shipped architecture: replace
references to "EXISTS subquery" and "correlated subqueries per row" with a
description that we perform a single-pass computation that precomputes a
hierarchical rollup of verified occurrences in Python and then applies those
results back to the queryset as CASE annotations (for counts) and an id__in
filter (for `verified=true`), and note that `agreed_exact_count` is computed
only when `with_agreement=true` and therefore gated for performance; keep the
same behavioral semantics (descendant-inclusive via parents_json__contains or
determination_id) and preserve the existing notes about which fields are
always-on vs gated.
- Around line 48-60: The text is inconsistent about whether the GIN index on
Taxon.parents_json is still a blocker; update the doc to state the resolved
dependency by referencing the actual migration that applied it (migration 0085)
and the index name main_taxon_parents_json_gin_idx so readers know the GIN index
was shipped, and remove or reword the "hard blocker" assertion (or conversely,
if it truly remains unshipped, change the line that says it was "already
flagged" to indicate it is planned in `#1307`); ensure you mention
Taxon.parents_json and main_taxon_parents_json_gin_idx and either mark the
blocker as resolved (index applied in migration 0085) or clearly state it is
still required before enabling recursive rollup correctness.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fae23575-10c9-4680-aad2-adee1ce29d2b

📥 Commits

Reviewing files that changed from the base of the PR and between e721c8b and 16b1468.

📒 Files selected for processing (11)

ami/main/api/serializers.py
ami/main/api/views.py
ami/main/migrations/0085_taxon_parents_json_gin_index.py
ami/main/models.py
ami/main/tests.py
docs/claude/planning/2026-05-20-taxa-verification-guidance-ticket.md
ui/src/data-services/models/species.ts
ui/src/pages/species-details/species-details.tsx
ui/src/pages/species/species-columns.tsx
ui/src/pages/species/species.tsx
ui/src/utils/getAppRoute.ts

coderabbitai · 2026-05-22T20:23:46Z

+        verified_occurrences = (
+            Occurrence.objects.filter(occurrence_filters)
+            .filter(default_filters_q)
+            .filter(models.Exists(Identification.objects.filter(occurrence=models.OuterRef("pk"), withdrawn=False)))
+            .annotate(_agreed_prediction_id=best_identification_agreed_prediction)
+        )


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail # Locate the file explicitly (should exist) ls -la ami/main/api/views.py # Show the code around the reported snippet and the rollup loop python3 - <<'PY' import itertools,sys,os path="ami/main/api/views.py" start=1660 end=1765 with open(path,'r',encoding='utf-8') as f: lines=f.readlines() for i in range(start-1,end): if i<0 or i>=len(lines): continue print(f"{i+1:5d} {lines[i].rstrip()}") PY # Find where occurrence_filters is built (and whether it includes collection joins) rg -n "def get_occurrence_filters|occurrence_filters\s*=" ami/main/api/views.py # Print the get_occurrence_filters implementation if in this file rg -n "get_occurrence_filters" ami/main/api/views.py # Also search for the TaxonViewSet mention and how it filters by collections/detections__source_image__collections rg -n "TaxonViewSet|get_occurrence_filters|detections__source_image__collections" ami/main/api/views.py

Repository: RolnickLab/antenna

Length of output: 7497

🏁 Script executed:

#!/bin/bash set -euo pipefail path="ami/main/api/views.py" # print get_occurrence_filters implementation region around 1480-1565 python3 - <<'PY' path="ami/main/api/views.py" start=1460 end=1585 with open(path,'r',encoding='utf-8') as f: lines=f.readlines() for i in range(start-1,end): if 0<=i<len(lines): print(f"{i+1:5d} {lines[i].rstrip()}") PY # show add_verification_data call site around 1590-1665 to see parameters/conditions python3 - <<'PY' path="ami/main/api/views.py" start=1560 end=1675 with open(path,'r',encoding='utf-8') as f: lines=f.readlines() for i in range(start-1,end): if 0<=i<len(lines): print(f"{i+1:5d} {lines[i].rstrip()}") PY # extract BEST_MACHINE_PREDICTION_ORDER and value field meaning maybe rg -n "BEST_MACHINE_PREDICTION_ORDER" ami/main/api/views.py

Repository: RolnickLab/antenna

Length of output: 12014

🏁 Script executed:

#!/bin/bash set -euo pipefail path="ami/main/api/views.py" # print get_occurrence_filters implementation region around 1480-1565 python3 - <<'PY' path="ami/main/api/views.py" start=1460 end=1585 with open(path,'r',encoding='utf-8') as f: lines=f.readlines() for i in range(start-1,end): if 0<=i<len(lines): print(f"{i+1:5d} {lines[i].rstrip()}") PY # show add_verification_data call site around 1590-1665 to see parameters/conditions python3 - <<'PY' path="ami/main/api/views.py" start=1560 end=1675 with open(path,'r',encoding='utf-8') as f: lines=f.readlines() for i in range(start-1,end): if 0<=i<len(lines): print(f"{i+1:5d} {lines[i].rstrip()}") PY # extract BEST_MACHINE_PREDICTION_ORDER and value field meaning maybe rg -n "BEST_MACHINE_PREDICTION_ORDER" ami/main/api/views.py

Repository: RolnickLab/antenna

Length of output: 12014

🏁 Script executed:

#!/bin/bash set -euo pipefail # Search within views.py for determination__parents_json usage in add_verification_data context rg -n "add_verification_data|determination__parents_json|determination_id|verified_occurrences" ami/main/api/views.py # Search other files for occurrence verification rollup maybe rg -n "verified_occurrences|agreed_with_prediction_counts|agreed_exact_counts" -S ami/main/api/views.py ami/main/api -g'*.py' # Search where determination fields are annotated for Occurrence in this viewset rg -n "determination\s*=" ami/main/api/views.py # Show Occurrence model relations if present in repository to understand if occurrences join to determination, etc. fd "models.py" ami/main -a | head -n 20 rg -n "class Occurrence|determinatio[n_]" ami/main -g'*.py' | head -n 50

Repository: RolnickLab/antenna

Length of output: 8562

🏁 Script executed:

#!/bin/bash set -euo pipefail # Search within views.py for determination__parents_json usage in add_verification_data context rg -n "add_verification_data|determination__parents_json|determination_id|verified_occurrences" ami/main/api/views.py # Search other files for occurrence verification rollup maybe rg -n "verified_occurrences|agreed_with_prediction_counts|agreed_exact_counts" -S ami/main/api/views.py ami/main/api -g'*.py' # Search where determination fields are annotated for Occurrence in this viewset rg -n "determination\s*=" ami/main/api/views.py # Locate Occurrence model definition and its determination relation fd -t f "models.py" ami/main -x sh -c 'echo {}' | head -n 200 rg -n "class Occurrence|determination__parents_json|parents_json" ami/main -g'*.py' | head -n 120

Repository: RolnickLab/antenna

Length of output: 10895

🏁 Script executed:

#!/bin/bash set -euo pipefail python3 - <<'PY' import re path="ami/main/models.py" # find line numbers for relevant classes/fields with open(path,'r',encoding='utf-8') as f: lines=f.readlines() def find(pattern): for i,l in enumerate(lines,1): if re.search(pattern,l): return i return None targets=[ r"class Occurrence\b", r"class Detection\b", r"class Identification\b", r"\bdetections\b", r"occurrences\b", ] for t in targets: ln=find(t) print(t,ln) PY # Print the Occurrence model section (first ~200 lines after class definition) python3 - <<'PY' path="ami/main/models.py" start=3140 # approx from earlier rg end=3320 with open(path,'r',encoding='utf-8') as f: lines=f.readlines() for i in range(start-1,end): print(f"{i+1:5d} {lines[i].rstrip()}") PY # Print Detection and relation to occurrences (search for class Detection) python3 - <<'PY' import re path="ami/main/models.py" with open(path,'r',encoding='utf-8') as f: lines=f.readlines() start=None for i,l in enumerate(lines,1): if re.match(r"class Detection\b", l): start=i break print("Detection starts at", start) end=start+220 for i in range(start-1, end-1): print(f"{i+1:5d} {lines[i].rstrip()}") PY # Print Identification relation to occurrence if helpful python3 - <<'PY' import re path="ami/main/models.py" with open(path,'r',encoding='utf-8') as f: lines=f.readlines() start=None for i,l in enumerate(lines,1): if re.match(r"class Identification\b", l): start=i break print("Identification starts at", start) end=start+120 for i in range(start-1, end-1): print(f"{i+1:5d} {lines[i].rstrip()}") PY

Repository: RolnickLab/antenna

Length of output: 22136

🏁 Script executed:

#!/bin/bash set -euo pipefail # Search for verified_count usage in code/tests rg -n "verified_count|agreed_with_prediction_count|agreed_exact_count|with_agreement" ami/main/tests.py ami/main/api -g'*.py' # Search for query param "collection=" in tests around taxa verified counts rg -n "collection=" ami/main/tests.py -g'*.py' # Search for "OccurencesVerified" filter or endpoints rg -n "OccurenceVerified|OccurrenceVerified" ami/main/api/views.py ami/main/api -g'*.py' # If possible, locate serializer/response field mapping for these counts rg -n "agreed_exact_count|agreed_with_prediction_count" ami/main/api/serializers.py -g'*.py'

Repository: RolnickLab/antenna

Length of output: 5077

🏁 Script executed:

#!/bin/bash set -euo pipefail # Search for verified_count usage in code/tests rg -n "verified_count|agreed_with_prediction_count|agreed_exact_count|with_agreement" ami/main/tests.py ami/main/api -g'*.py' # Search for query param "collection=" in tests around taxa verified counts rg -n "collection=" ami/main/tests.py -g'*.py' # Search for "OccurencesVerified" filter or endpoints rg -n "OccurrenceVerified" ami/main/api/views.py ami/main/api -g'*.py' # Locate serializer response fields for these counts rg -n "agreed_exact_count|agreed_with_prediction_count|verified_count" ami/main/api/serializers.py -g'*.py'

Repository: RolnickLab/antenna

Length of output: 5077

De-duplicate Occurrence rows before counting in the Python rollup

In ami/main/api/views.py, when occurrence_filters includes the collection constraint (detections__source_image__collections), a single Occurrence can join to multiple matching Detection rows. The code at lines ~1696-1701 builds verified_occurrences from that joined queryset, and the loop at ~1717-1741 counts every SQL row from verified_occurrences.values(...), so verified_count, agreed_with_prediction_count, and agreed_exact_count become non–occurrence-weighted.

Suggested fix

- value_fields = ["determination_id", "determination__parents_json", "_agreed_prediction_id"] + value_fields = ["pk", "determination_id", "determination__parents_json", "_agreed_prediction_id"] if include_agreement: # Top machine prediction's taxon for the same occurrence. verified_occurrences = verified_occurrences.annotate( _best_machine_taxon_id=models.Subquery( Classification.objects.filter(detection__occurrence=models.OuterRef("pk")) .order_by(*BEST_MACHINE_PREDICTION_ORDER) .values("taxon_id")[:1] ) ) value_fields.append("_best_machine_taxon_id") verified_counts: dict[int, int] = {} agreed_with_prediction_counts: dict[int, int] = {} agreed_exact_counts: dict[int, int] = {} - for row in verified_occurrences.values(*value_fields): + for row in verified_occurrences.values(*value_fields).distinct():

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ami/main/api/views.py` around lines 1696 - 1701, The count loop is iterating SQL rows from verified_occurrences (the Occurrence queryset built with variable verified_occurrences and annotated with _agreed_prediction_id), but when occurrence_filters includes detections__source_image__collections a single Occurrence can join to multiple Detection rows, inflating counts; fix by de-duplicating Occurrence rows before the Python rollup—call verified_occurrences = verified_occurrences.distinct('pk') (or .values_list('pk', flat=True).distinct() and re-query if you need full objects) immediately after the annotated queryset is built and before the values()/counting loop that computes verified_count, agreed_with_prediction_count, and agreed_exact_count.

Copilot AI review requested due to automatic review settings May 22, 2026 04:46

Copilot started reviewing on behalf of mihow May 22, 2026 04:46 View session

Copilot AI reviewed May 22, 2026

View reviewed changes

coderabbitai Bot reviewed May 22, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(taxa): per-taxon verification & agreement counts + verified filter (#1316)#1317

feat(taxa): per-taxon verification & agreement counts + verified filter (#1316)#1317
mihow wants to merge 2 commits into
mainfrom
worktree-taxa-verification-counts

mihow commented May 22, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

netlify Bot commented May 22, 2026 •

edited

Loading

Uh oh!

netlify Bot commented May 22, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented May 22, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

Copilot AI left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

mihow commented May 22, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Backend (GET /api/v2/taxa/)

Frontend

Tests

Performance — needs a decision before merge

Note on count semantics

Test plan

Summary by CodeRabbit

Release Notes

Uh oh!

netlify Bot commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for antenna-preview canceled.

Uh oh!

netlify Bot commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for antenna-ssec canceled.

Uh oh!

coderabbitai Bot commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 22, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mihow commented May 22, 2026 •

edited by coderabbitai Bot

Loading

Backend (`GET /api/v2/taxa/`)

netlify Bot commented May 22, 2026 •

edited

Loading

netlify Bot commented May 22, 2026 •

edited

Loading

coderabbitai Bot commented May 22, 2026 •

edited

Loading