relax gaf/paf filtering on first stage of mgSplit pipeline #1942

Merged
glennhickey merged 1 commit into master from docker-fixes
May 25, 2026

Conversation

@glennhickey
Collaborator

@Sagorikanag found a bug where chrY would get dropped when running cactus-pangenome --mgSplit on a 3-genome graph. The root issue seems to be that the --mgSplit workflow first builds a reference-only graph (just CHM13 in this case), then runs the usual splitting pipeline on that. But this also triggers a bunch of the gaf/paf filtering that normally happens in this stage. In particular, the paf overlap filter gobbles up the linear chrY alignment because of large ambiguous blocks.

This patch just turns off all these filters for the reference-only graph stages of --mgSplit.
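
To illustrate the idea (this is only a toy sketch, not the actual Cactus code): the names `Alignment`, `overlap_filter`, `filter_stage`, and the `ambiguous` flag are all hypothetical stand-ins, and the real paf overlap filter is more involved than a boolean check. The point is just that a filter which drops ambiguous alignments can remove a contig's only alignment, and that bypassing the filter for the reference-only stage keeps it.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    query: str       # contig name, e.g. "chrY"
    ambiguous: bool  # hypothetical flag: alignment sits in an ambiguous block


def overlap_filter(alignments):
    """Toy stand-in for an overlap filter: drop ambiguous alignments."""
    return [a for a in alignments if not a.ambiguous]


def filter_stage(alignments, reference_only):
    """Skip all filtering when the graph contains only the reference."""
    if reference_only:
        return list(alignments)
    return overlap_filter(alignments)


def contigs(alignments):
    """Sorted names of the contigs that still have an alignment."""
    return sorted({a.query for a in alignments})


alns = [Alignment("chr1", False), Alignment("chrY", True)]
print(contigs(filter_stage(alns, reference_only=False)))  # ['chr1']
print(contigs(filter_stage(alns, reference_only=True)))   # ['chr1', 'chrY']
```

With filtering on, chrY has no surviving alignment and would fall out of the split; with filtering off for the reference-only stage, every contig is kept.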

By virtue of having 2 references (CHM13 and GRCh38), the v2.1 HPRC graphs seem to avoid this issue (which is probably why I never ran into it). But I think it's a pretty serious one that could cause contigs to be dropped by --mgSplit with a single --reference in many cases...

@glennhickey glennhickey merged commit 37eef3b into master May 25, 2026
1 check passed