Skip to content

UMI Consensus read implementation#164

Closed
freerkvandijk wants to merge 25 commits intodevfrom
UMI_consensus
Closed

UMI Consensus read implementation#164
freerkvandijk wants to merge 25 commits intodevfrom
UMI_consensus

Conversation

@freerkvandijk
Copy link
Copy Markdown

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@tomiles tomiles changed the title Initial commit UMI consensus read implementation Mar 17, 2026
@freerkvandijk freerkvandijk changed the title UMI consensus read implementation UMI Consensus initial commit Mar 17, 2026
@tomiles tomiles changed the base branch from main to dev March 17, 2026 10:45
@tomiles tomiles changed the title UMI Consensus initial commit UMI Consensus read implementation Mar 17, 2026
@tomiles tomiles added the enhancement New feature or request label Mar 17, 2026
Copy link
Copy Markdown

@tomiles tomiles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, [modules/local/umi_consensus/main.nf‎ is a mess. Almost everything there has a matching nf-core module, so we should switch to those. Local models should be limited to preprocessing workflow specific implementation that we can’t be found/pushed upstream in nf-core modules.

@tomiles
Copy link
Copy Markdown

tomiles commented Mar 17, 2026

Don't forget to remove the unnecessary tests for the now deleted local modules.

include { SAMTOOLS_CONVERT as SAMTOOLS_CONVERT_UMI } from "../../../modules/nf-core/samtools/convert/main"
include { SAMTOOLS_SORMADUP } from "../../../modules/nf-core/samtools/sormadup/main.nf"
include { SAMTOOLS_SORT } from "../../../modules/nf-core/samtools/sort/main"
include { UMI_CONSENSUS_KAPA } from '../../local/umi_consensus/main'
Copy link
Copy Markdown

@tomiles tomiles Mar 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it be that you forgotten git add the now separated UMI_CONSENSUS_KAPA subworkflow file?

docs/README.md Outdated

- [Usage](usage.md)
- An overview of how the pipeline works, how to run it and a description of all of the different command-line flags.
- [Technical map](technical_map.md)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this change originate from? Since there’s no technical_map.md introduced in this PR, is this a merge/rebase conflict resolution artifact?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test samplesheet should go in the tests/inputs/test.yml Amend test samplesheet and retain per sample configurability

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point here?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean up unused tests

}

params {
disable_pipeline_plugins = true
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Y THO?


// Local lightweight filtering step used before module-based UMI processing.
// Keeps primary mapped reads and removes unmapped/secondary/supplementary records.
process UMI_LOCAL_SAMTOOLS_VIEW {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this to be an extra process. You can just alias the nf-core samtools_view module and configure in modules.config

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integrate relevant (global) info in existing markdown files, amend the metro map and then remove this file

docs/usage.md Outdated
When `umi_consensus` is enabled, the `UMI_CONSENSUS_KAPA` workflow runs multiple `samtools`/`fgbio` processes plus remapping through `FASTQ_ALIGN_DNA`.
With container-based execution (`docker`, `singularity`, etc.), default per-process containers are already configured.

Example custom config snippet:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop this snippet

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete this file

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete this file

}
.set { ch_umi_bam_bai_fasta_fai }

SAMTOOLS_CONVERT_UMI(ch_umi_bam_bai_fasta_fai)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for extra import, just mix the output bams in the bam channel and pipe to the regular samtools view

.set { ch_cram_crai }
ch_cram_crai.dump(tag: "FASTQ_TO_CRAM: cram and crai", pretty: true)

ch_sormadup_metrics = ch_sormadup_metrics.mix(UMI_CONSENSUS_KAPA.out.family_sizes)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this?


main:
// 1) Initial mapping of raw reads with the configured aligner/index.
FASTQ_ALIGN_DNA(ch_meta_reads_aligner_index_fasta, false)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be handled by the parent workflow, use pre-aligned bam as input

}

// 3) Prepare read-pair metadata and UMI tags before consensus calling.
UMI_SAMTOOLS_VIEW(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point here?

// 3) Prepare read-pair metadata and UMI tags before consensus calling.
UMI_SAMTOOLS_VIEW(
FASTQ_ALIGN_DNA.out.bam
.map { meta, bam -> [meta, bam, file('/etc/hosts')] },
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?????

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this and add to test-datasets if needed

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of this?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the copy?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do this?

-o ${prefix}.bam \\
-t ${task.cpus} \\
snap-aligner ${subcmd} \
\$INDEX \
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break the module

SAMTOOLS_STATS(ch_bam_bai_fasta_fai)
SAMTOOLS_FLAGSTAT(ch_bam_bai_fasta_fai)
SAMTOOLS_IDXSTATS(ch_bam_bai_fasta_fai)
SAMTOOLS_FLAGSTAT(ch_bam_bai)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the change?

// FUNCTIONS
include { getGenomeAttribute } from '../../local/utils_nfcore_preprocessing_pipeline'

workflow FASTQ_ALIGN_DNA {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 workflow per file please

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants