NEW: add report to extract-reads enabling interrogation of why reads were/were not extracted #222
Open
gregcaporaso wants to merge 5 commits into qiime2:dev
Force-pushed from c59f58f to 3d88165.
Adds a second output, `read_extraction_stats` (ImmutableMetadata), to the `extract_reads` action. The new artifact is a per-input-sequence TSV reporting extraction outcome, match orientation and method (exact vs. approximate), forward and reverse primer binding positions in the matched-orientation sequence, per-primer match percentages, amplicon length before and after trimming, and input sequence length.

Adds two unit tests asserting that `_gen_reads` returns the same amplicon when given a sequence and its reverse complement (`read_orientation='both'`). The tests cover both the exact-match and approximate-match code paths and assert the expected match-method and match-orientation stats.

Updated internal helpers: `_align_primer` now returns a named tuple with primer binding coordinates; `_exact_match` and `_approx_match` return richer result tuples; `_gen_reads` always returns an (amplicon, stats) pair.

Note from @gregcaporaso: Claude was used in the development of the code and the tests. I did a full review of the code and tests before passing this along for code review by another human. Claude's tests were a bit underwhelming, and that is where I made most of my modifications. For example, all values in `test_extract_reads_stats_all_extracted` were hand-calculated by me (Claude previously had very general tests of the output stats).

This is my first use of Claude for new functionality going into a QIIME 2 plugin. The new output table has already been very helpful for me in a few exploratory analyses that I've run. Next, I hope to use it to validate that some code simplifications I want to make in `extract_reads` don't change the results in unintended ways (this was very hard to assess in my previous recent work on this action).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
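For readers unfamiliar with the property the new unit tests check, here is a minimal, self-contained sketch of it. It is not the plugin's code: `toy_gen_reads`, `ToyStats`, and every field name below are hypothetical stand-ins, only exact matching is modeled, and the real `_gen_reads` signature and stats columns may differ. The point is only to illustrate that a sequence and its reverse complement should yield the same amplicon, with the reported match orientation flipping.

```python
from typing import NamedTuple, Optional, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse-complement an unambiguous DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


class ToyStats(NamedTuple):
    # Hypothetical field names; the real report's columns may differ.
    extracted: bool
    match_orientation: Optional[str]  # "forward", "reverse", or None
    match_method: Optional[str]       # only "exact" in this toy model
    f_primer_start: Optional[int]     # position in the matched orientation
    r_primer_start: Optional[int]
    input_length: int


def _toy_exact_match(seq: str, f_primer: str, r_primer: str):
    """Return (amplicon, f_start, r_start) with primers trimmed, or None."""
    r_site = reverse_complement(r_primer)  # reverse primer binds the other strand
    f_start = seq.find(f_primer)
    if f_start == -1:
        return None
    f_end = f_start + len(f_primer)
    r_start = seq.find(r_site, f_end)
    if r_start == -1:
        return None
    return seq[f_end:r_start], f_start, r_start


def toy_gen_reads(seq: str, f_primer: str,
                  r_primer: str) -> Tuple[Optional[str], ToyStats]:
    """Always return an (amplicon, stats) pair, trying both orientations."""
    candidates = (("forward", seq), ("reverse", reverse_complement(seq)))
    for orientation, candidate in candidates:
        hit = _toy_exact_match(candidate, f_primer, r_primer)
        if hit is not None:
            amplicon, f_start, r_start = hit
            stats = ToyStats(True, orientation, "exact",
                             f_start, r_start, len(seq))
            return amplicon, stats
    return None, ToyStats(False, None, None, None, None, len(seq))


# Made-up primers and template, chosen only for illustration.
F_PRIMER = "GATTACA"
R_PRIMER = "CCTGAAG"
TEMPLATE = "TT" + F_PRIMER + "GGGCCCAAA" + reverse_complement(R_PRIMER) + "AA"

fwd_amplicon, fwd_stats = toy_gen_reads(TEMPLATE, F_PRIMER, R_PRIMER)
rev_amplicon, rev_stats = toy_gen_reads(
    reverse_complement(TEMPLATE), F_PRIMER, R_PRIMER)

# Same amplicon either way; only the reported orientation differs.
assert fwd_amplicon == rev_amplicon
assert fwd_stats.match_orientation != rev_stats.match_orientation
```

Returning a stats record even when nothing is extracted is what makes a per-input-sequence report possible: the failed reads still contribute a row explaining the outcome.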
Force-pushed from 3d88165 to 86fb4a2.
colinvwood approved these changes on May 5, 2026
Replacing #221.