 AliNe is a pipeline written in Nextflow that aims to efficiently align reads against a reference using the tools of your choice.
 
+Input: file, list of file, folder or csv
+Output: Coordinate sorted BAM file.
+
 ## Table of Contents
 
 *[Foreword](#foreword)
@@ -45,7 +48,7 @@ You can choose to run one or several aligner in parallel.
 
 | Tool | Single End (short reads) | Paired end (short reads) | Pacbio | ONT |
 | --- | --- | --- | --- | --- |
-| bbmap | ✅ | ✅ |⚠️ | ⚠️|
+| bbmap | ✅ | ✅ |✅ use mapPacBio.sh | ✅ use mapPacBio.sh|
 | bowtie | ✅ | ✅ | ⚠️ | ⚠️ |
 | bowtie2 | ✅ | ✅ | ⚠️ | ⚠️ |
 | bwaaln | ✅ | ✅ R1 and R2 independently aligned then merged with bwa sampe | ⚠️ | ⚠️ |
@@ -56,7 +59,7 @@ You can choose to run one or several aligner in parallel.
 | hisat2 | ✅ | ✅ | ⚠️ | ⚠️ |
 | kallisto | ✅ | ✅ | ⚠️ | ⚠️ |
 | last | ⚠️ | ⚠️ R1 and R2 independently aligned then merged with maf-convert | ✅ | ✅ |
-| minimap2 |⚠️|⚠️| ✅ | ✅ |
+| minimap2 |✅|✅| ✅ | ✅ |
 | ngmlr | ⚠️ | ⚠️ R1 and R2 independently aligned then merged with cat | ✅ | ✅ |
 | novoalign | ✅ | ✅ | ✅ | ⚠️ |
 | nucmer | ✅ | ✅ R1 and R2 independently aligned then merged with cat | ⚠️ | ⚠️ |
@@ -322,21 +325,26 @@ On success you should get a message looking like this:
 --help prints the help section
 
 General Parameters
---reads path to the reads folder or (remote) file (commma separated list of remote file accepted).
-If a folder is provided, all the files with proper extension are detected.
+--reads path to the reads file, folder or csv. If a folder is provided, all the files with proper extension in the folder will be used. You can provide remote files (commma separated list).
 file extension expected :<.fastq.gz>, <.fq.gz>, <.fastq> or <.fq>
-for paired reads extra <_R1_001> or <_R2_001> is expected where <R> and <_001> are optional. e.g. <sample_id_1.fastq.gz>, <sample_id_R1.fastq.gz>, <sample_id_R1_001.fastq.gz>)
+for paired reads extra <_R1_001> or <_R2_001> is expected where <R> and <_001> are optional. e.g. <sample_id_1.fastq.gz>, <sample_id_R1.fastq.gz>, <sample_id_R1_001.fastq.gz>)
+fastq_2 is optional and can be empty. Strandedness, read_type and data_type expect same values as corresponding AliNe parameters; If a value is provided via AliNe parameter, it will override the value in the csv file.
 --reference path to the reference file (fa, fa.gz, fasta or fasta.gz)
 --aligner aligner(s) to use among this list (comma or space separated) [bbmap, bowtie, bowtie2, bwaaln, bwamem, bwamem2, bwasw, graphmap2, hisat2, kallisto, minimap2, novoalign, nucmer, ngmlr, star, subread, sublong]
 --outdir path to the output directory (default: alignment_results)
 --annotation [Optional][used by graphmap2, STAR, subread] Absolute path to the annotation file (gtf or gff3)
 
 Type of input reads
+--data_type type of data among this list [DNA, RNA] (no default)
 --read_type type of reads among this list [short_paired, short_single, pacbio, ont] (default: short_paired)
---library_type Set the library_type of your reads (default: auto). In auto mode salmon will guess the library typefor each sample.
+--strandedness Set the library_type of your reads (default: auto). In auto mode salmon will guess the library typefor each sample.
 If you know the library type you can set it to one of the following: [U, IU, MU, OU, ISF, ISR, MSF, MSR, OSF, OSR]. See https://salmon.readthedocs.io/en/latest/library_type.html for more information.
 In such case the sample library type will be used for all the samples.
---skip_libray_usage Skip the usage of library type provided by the user or guessed by salmon.
 
 Extra steps
 --trimming_fastp run fastp for trimming (default: false)
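
The paired-read naming rule in the help text above (`<_R1_001>` or `<_R2_001>`, where `<R>` and `<_001>` are optional) can be illustrated with a small pattern match. This is a minimal sketch in Python, not AliNe's actual Nextflow implementation: the regular expression and the helper name `split_mate` are assumptions made for illustration, built only from the extensions and examples listed in the help text.

```python
import re

# Assumed reading of the help text: an optional "R", a mate digit (1 or 2),
# an optional "_001" suffix, then one of the listed extensions.
PAIR_PATTERN = re.compile(
    r"^(?P<sample>.+)_R?(?P<mate>[12])(?:_001)?\.(?:fastq|fq)(?:\.gz)?$"
)


def split_mate(filename):
    """Return (sample_id, mate) for a paired-read file name, or None.

    Hypothetical helper: AliNe's own detection logic may differ.
    """
    match = PAIR_PATTERN.match(filename)
    if match is None:
        return None
    return match.group("sample"), int(match.group("mate"))


# The three example names from the help text share one sample id.
for name in (
    "sample_id_1.fastq.gz",
    "sample_id_R1.fastq.gz",
    "sample_id_R1_001.fastq.gz",
):
    print(name, split_mate(name))
```

A name without a mate suffix, such as `sample_id.fastq.gz`, returns `None` in this sketch and would be treated as a single-end file.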
@@ -392,8 +400,11 @@ Here the description of typical ouput you will get from AliNe:
 ├── mean_read_length # Folder with mean read length computed in bash (optional - done if selected aligners need the info and no value provided by the user)
 │ └── sample1_seqkit_trim_sampled_read_length.txt # Mean read length for sample1
 │
-├── salmon_libtype # Librairy information (read orientation and strand information) detected via Salmon
-│ └── sample1_lib_format_counts.json # Librairy information detectected for sample1
+├── salmon_libtype # Library information (read orientation and strand information) detected via Salmon
+│ └── sample1_lib_format_counts.json # Library information detectected for sample1
+|
+├── aline_updated_params
+| └── sample1.txt # File resuming the parameters automatically set by AliNe
 |
 ├── alignment # Folder gathering all alignment output (indicies, sorted bam and logs)
 │ ├── aligner1 # Folder gathering data produced by aligner
 params.debug && log.info('library type alignment')
 // ------------------- BBMAP -----------------
 if ("bbmap"in aligner_list ){
@@ -1260,9 +1269,15 @@ def helpMSG() {
 --help prints the help section
 
 General Parameters
---reads path to the reads fileor folder. If a folder is provided, all the files with proper extension in the folder will be used. You can provide remote files (commma separated list).
+--reads path to the reads file, folder or csv. If a folder is provided, all the files with proper extension in the folder will be used. You can provide remote files (commma separated list).
 file extension expected : <.fastq.gz>, <.fq.gz>, <.fastq> or <.fq>
-for paired reads extra <_R1_001> or <_R2_001> is expected where <R> and <_001> are optional. e.g. <sample_id_1.fastq.gz>, <sample_id_R1.fastq.gz>, <sample_id_R1_001.fastq.gz>)
+for paired reads extra <_R1_001> or <_R2_001> is expected where <R> and <_001> are optional. e.g. <sample_id_1.fastq.gz>, <sample_id_R1.fastq.gz>, <sample_id_R1_001.fastq.gz>)
+fastq_2 is optional and can be empty. Strandedness, read_type and data_type expect same values as corresponding AliNe parameters; If a value is provided via AliNe paramter, it will override the value in the csv file.
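
The override rule stated in the added help text (a value passed as an AliNe parameter takes precedence over the value in the csv file) can be sketched as follows. This is an illustrative Python sketch under assumptions, not the pipeline's Nextflow code: the column names `sample` and `fastq_1`, the sample sheet contents, and the helper `resolve_settings` are hypothetical; only `fastq_2`, `strandedness`, `read_type` and `data_type` are named in the diff.

```python
import csv
import io

# Hypothetical sample sheet. "sample" and "fastq_1" are assumed column names;
# fastq_2 is left empty for the single-end sample, as the help text allows.
SHEET = """sample,fastq_1,fastq_2,strandedness,read_type,data_type
s1,s1_R1.fastq.gz,s1_R2.fastq.gz,auto,short_paired,RNA
s2,s2.fastq.gz,,U,short_single,RNA
"""

OVERRIDABLE = ("strandedness", "read_type", "data_type")


def resolve_settings(row, cli_params):
    """Merge one csv row with command-line values.

    A value given on the command line replaces the csv value;
    otherwise the csv value is kept unchanged.
    """
    resolved = dict(row)
    for key in OVERRIDABLE:
        if cli_params.get(key):
            resolved[key] = cli_params[key]
    return resolved


rows = list(csv.DictReader(io.StringIO(SHEET)))
# Simulates running with --data_type DNA and no other overrides.
resolved = [resolve_settings(row, {"data_type": "DNA"}) for row in rows]
for entry in resolved:
    print(entry["sample"], entry["data_type"], entry["strandedness"])
```

In this sketch every sample ends up with `data_type` set to `DNA`, while `strandedness` and `read_type` keep their per-sample csv values because no command-line value was supplied for them.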