Skip to content

Peak Calling with macs2 #307

@plbngl

Description

@plbngl

Hello,
thanks again for the nice tool!
I have been using your function for pseudobulk and macs2 peak calling and while it seems to work fine, after inspection of peak calls I found them to be weird in some cases, especially at highly transcribed genes, where large peaks spanning the entire genes are found, rather than defined peaks. I am attaching example plots where I compare 3 tracks

  1. Default parameters of the call_peaks_macs() function: --format BED --call-summits --keep-dup all --shift -75 --extsize 150 --nomodel --nolambda
  2. Remove duplicated in the call_peaks_macs() function: --format BED --call-summits --keep-dup 1 --shift -75 --extsize 150 --nomodel --nolambda
  3. Default peak calling with Signac CallPeaks() function: --format BED --keep-dup 1 --nomodel --extsize 200 --shift -100
  4. Peak calling with Signac CallPeaks() function using parameters to match bpcell default: --format BED --call-summits --keep-dup all --shift -75 --extsize 150 --nomodel --nolambda

The peak tracks are from Monocyte CD14 cells and I am showing an example of highly transcribed gene (HLA-DRA) and a gene that is not expressed (but that show accessible sites nevertheless and have same calls in all 4 versions)

Image Image

Here I am also attaching the number of peaks and size distribution
Image

I know that there are are a lot of different elements that go into the peak calling such as the input BED file (signac uses the fragment file coordinates), the shift/extend option, BED or BEDPE, keep dup 1 or keep dup all, so I am starting to get confused on which is the best combination of parameters to get peaks that are of reasonable size and reflecting the celltype of origin. Number 3 looks the best however I think I am missing a lot of signal.
Perhaps you guys that have thought a lot about it have a solution? Or at least an explanation regarding the method you have implemented and your best practice suggestion.
Thanks a lot!
Paola

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions