Helper scripts for bcbio-nextgen
Requirements: Git, Conda
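Before installing, it can help to confirm that both requirements are on your PATH. The following is a minimal sketch, not part of bcbio-utils: the function name `missing_tools` is hypothetical, and it assumes the tools are invoked as `git` and `conda`.

```shell
# Hypothetical helper (not part of bcbio-utils): print the names of any
# given commands that cannot be found on PATH, separated by spaces.
missing_tools() {
    _missing=""
    for _tool in "$@"; do
        if ! command -v "$_tool" >/dev/null 2>&1; then
            _missing="$_missing $_tool"
        fi
    done
    # Strip the leading space before printing
    echo "${_missing# }"
}

# Assumption: the requirements are available under the names git and conda
not_found=$(missing_tools git conda)
if [ -n "$not_found" ]; then
    echo "Missing requirements: $not_found" >&2
else
    echo "All requirements found"
fi
```

If anything is reported missing, install it before running the commands below.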
To install:
git clone git@github.com:bcbio/bcbio-utils.git
conda env create --name bcbio-utils --file bcbio-utils/environment.yml

If you would like to use CWL scripts, you will need to install cwltool separately:
conda activate bcbio-utils
pip install cwltool

Scripts:
analyze_complexity_by_starts.py - create reads sequenced vs unique start sites graph for examining the quality of a library
analyze_quality_recal.py - provide plots summarizing recalibration of quality scores
bam_to_fastq_region.py - prepare paired end fastq files from a chromosome region in an aligned input BAM file
bam_to_wiggle.py - convert BAM files to BigWig file format in a specified region
bcbio_prep_cwl_genomes.py - clean and prepare a set of genomes for CWL usage and upload
broad_redo_analysis.py - redo post-processing of Broad alignments with updated pipeline
build_compare_vcf.py - build a test comparison dataset from an existing VCF file
build_gatk_jar.sh - build a GATK jar without embedded dependencies from current git
cg_svevents_to_vcf.py - convert Complete Genomics SvEvents file of structural variants to VCF
collect_metrics_to_csv.py - collect alignment summary metrics from multiple lanes and summarize as CSV
convert_samplesheet_config.py - convert Illumina SampleSheet CSV files to the run_info.yaml input file
find_clonal_svs.py - find 10x structural variants present uniquely in parent or clones
format_dream_truthset.py - format DREAM challenge truth sets to contain BED files of covered regions and SVs
gb2genome.py - convert genbank to gtf
hla_loh_comparison.py - run LOH heterogeneity comparison amongst multiple methods, focusing on HLA
hlas_to_pgroups.py - collapse HLAs present in hg38 1000 genomes distribution to p-groups
hydra_to_vcf.py - convert Hydra BEDPE output into VCF 4.1 format
monthly_billing_report.py - retrieve from Galaxy a high level summary report of sequencing done in a month
plink_to_vcf.py - convert Plink ped/map files into VCF format using plink and Plink/SEQ
rename_samples.py - rename sample name in a BAM file, eliminating spaces and colon characters
resort_bam_karyotype.py - resort a BAM file karyotypically to match GATK's preferred file order
rtg_to_callable.py - convert RTG coverage statistics into a BED file of callable regions
sort_gatk_intervals.py - sort GATK interval lists based on a sequence dictionary
summarize_gemini_tstv.py - provide table summarizing Transition/Transversion ratios for variants
summarize_priority_variants.py - summarize priority calls in annotated structural variants
summarize_timing.py - convert time stamps from bcbio logs into hourly timings per step
tcga_to_bcbio.py - handle pairing primary and metastasized tumors with blood or solid normals
test_resources.py - ?
upload_to_synapse.py - upload bcbio reference materials and inputs to a Synapse project