Skip to content

VCF coordinates appear to be 0-based instead of 1-based #13

@djhshih

Description

@djhshih

The VCF output file appears to contain 0-based coordinates.
As per the specification, VCF files should contain 1-based coordinates (https://samtools.github.io/hts-specs/VCFv4.1.pdf).

(FYI: VCF also specify that "telomeres are indicated by using positions 0 or N+1, where N is the length of the corresponding chromosome or contig.")

In the below example, the event in the VCF file should have position 16376517 instead of 16376516.

breakseq.vcf.gz:

#CHROM  POS ID  REF ALT QUAL  FILTER  INFO  FORMAT  PB0418-N
1 16376516  . C <DEL> . LowQual SVLEN=-9741;SVTYPE=DEL;END=16386257 GT:ABC:PE:REFCOUNTS 1/1:0,0,1:1:0,0

breakseq.out:

1 1KG_Phase1  Deletion  16376517  16386257  1 . C NAME C1WLGACXX130310:1:1301:1852:84776$1;MAPQ 37;CIGAR 76M;POS 61;END 137;LIBLEN 200;LSPAN 39;RSPAN 37;PE Y;SVLEN 9741

breakseq_genotyped.gff:

1 BreakSeq  Deletion  16376517  16386257  1 . . QUAL LowQual;ABC 0,0,1;PE 1;SVLEN 9741;GT 1/1;COUNTS 0,0

The breakpoint example above is annotated in the library as:

$ grep 163765 breakseq2_bplib_20150129.gff
1       1KG_Phase1      Deletion        16376517        16386257        .       .       .

I am assuming that the breakpoint GFF library contains 1-based coordinate, as per the specification (http://useast.ensembl.org/info/website/upload/gff.html).

Invocation:

run_breakseq2.py \
        --reference /seq/references/Homo_sapiens_assembly19/v1/Homo_sapiens_assembly19.fasta \
        --work work \
        --bwa bwa-0.7.12/bwa \
        --samtools samtools-0.1.19/samtools \
        --bplib_gff bplib/breakseq2_bplib_20150129.gff \
        --nthreads 4 \
        --sample PB0418-N \
        --bams /seq/picard_aggregation/C932/PB0418-N/v7/PB0418-N.bam \

The run log reports success on the last line:

INFO 2016-08-17 10:31:56,875 breakseq2_workflow   Done! (342.241 s)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions