SalmonTE is an ultra-fast and scalable quantification pipeline for transposable element (TE) abundances. It is based on snakemake, salmon and R. Note that SalmonTE packages its own version of salmon.
SalmonTE.py quant is multithreaded. Please match the number of threads to the number of allocated CPUs.
Example data is available in $SALMONTE_TEST_DATA.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=6g --cpus-per-task=4 --gres=lscratch:10
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ cd /lscratch/$SLURM_JOB_ID
[user@cn3144]$ module load salmonte
[user@cn3144]$ cp -r ${SALMONTE_TEST_DATA:-none}/data .
[user@cn3144]$ ls -lh data
total 5.0M
-rw-rw-r-- 1 user group 634K Nov 11 10:13 CTRL_1_R1.fastq
-rw-rw-r-- 1 user group 634K Nov 11 10:13 CTRL_1_R2.fastq
-rw-rw-r-- 1 user group 634K Nov 11 10:13 CTRL_2_R1.fastq
-rw-rw-r-- 1 user group 634K Nov 11 10:13 CTRL_2_R2.fastq
-rw-rw-r-- 1 user group 634K Nov 11 10:13 TARDBP_1_R1.fastq
-rw-rw-r-- 1 user group 634K Nov 11 10:13 TARDBP_1_R2.fastq
-rw-rw-r-- 1 user group 634K Nov 11 10:13 TARDBP_2_R1.fastq
-rw-rw-r-- 1 user group 634K Nov 11 10:13 TARDBP_2_R2.fastq
[user@cn3144]$ SalmonTE.py quant --reference=hs --outpath=quant_out \
--num_threads=$SLURM_CPUS_PER_TASK --exprtype=count data
2019-11-11 10:19:30,550 Starting quantification mode
2019-11-11 10:19:30,550 Collecting FASTQ files...
2019-11-11 10:19:30,553 The input dataset is considered as a paired-ends dataset.
2019-11-11 10:19:30,553 Collected 4 FASTQ files.
2019-11-11 10:19:30,553 Quantification has been finished.
2019-11-11 10:19:30,553 Running Salmon using Snakemake
...
[user@cn3144]$ ls -lh quant_out
total 68K
-rw-rw-r-- 1 user group 23K Nov 11 10:19 clades.csv
-rw-rw-r-- 1 user group 63 Nov 11 10:19 condition.csv
drwxrwxr-x 5 user group 4.0K Nov 11 10:19 CTRL_1
drwxrwxr-x 5 user group 4.0K Nov 11 10:19 CTRL_2
-rw-rw-r-- 1 user group 17K Nov 11 10:19 EXPR.csv
-rw-rw-r-- 1 user group 161 Nov 11 10:19 MAPPING_INFO.csv
drwxrwxr-x 5 user group 4.0K Nov 11 10:19 TARDBP_1
drwxrwxr-x 5 user group 4.0K Nov 11 10:19 TARDBP_2
Notes:
Before running the differential expression test, it is necessary to update the file quant_out/condition.csv to include your experimental conditions.
[user@cn3144]$ mv quant_out/condition.csv quant_out/condition.csv.orig
[user@cn3144]$ cat <<EOF > quant_out/condition.csv
SampleID,condition
TARDBP_1,treatment
CTRL_1,control
TARDBP_2,treatment
CTRL_2,control
EOF
[user@cn3144]$ ### or just edit the condition.csv file with your favorite text editor
[user@cn3144]$ SalmonTE.py test --inpath=quant_out --outpath=test_out \
--tabletype=csv --figtype=png --analysis_type=DE \
--conditions=control,treatment
[user@cn3144]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf]$
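Because the differential expression test depends on condition.csv being filled in correctly, it can help to check the file before running SalmonTE.py test. The following is a minimal sketch, not part of SalmonTE: check_conditions is a hypothetical helper that assumes the layout shown in the session above (a header line followed by SampleID,condition rows, and one directory per sample inside the quantification output directory).

```shell
#!/bin/bash
# check_conditions DIR COND1,COND2
# Hypothetical helper (not part of SalmonTE). Assumes DIR/condition.csv has a
# header line followed by "SampleID,condition" rows, as in the session above.
# Returns nonzero if a sample has no directory in DIR or if its condition is
# not one of the comma-separated labels given as the second argument.
check_conditions() {
    local dir="$1" allowed="$2" status=0 header sample cond
    {
        # Discard the header line (SampleID,condition).
        read -r header
        while IFS=, read -r sample cond || [ -n "$sample" ]; do
            # Skip blank lines.
            if [ -z "$sample" ]; then
                continue
            fi
            if [ ! -d "$dir/$sample" ]; then
                echo "no quantification directory for sample: $sample" >&2
                status=1
            fi
            # Compare the label against the allowed list, comma-delimited.
            case ",$allowed," in
                *",$cond,"*) ;;
                *) echo "unexpected condition for $sample: '$cond'" >&2
                   status=1 ;;
            esac
        done
    } < "$dir/condition.csv"
    return "$status"
}
```

With the files from the sample session, `check_conditions quant_out control,treatment` would be called just before SalmonTE.py test, passing the same labels given to --conditions.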
Create a batch script (e.g. salmonte.sh). For example:
#!/bin/bash
module load salmonte/0.4 || exit 1
SalmonTE.py quant --reference=hs --outpath=all_quant_out \
--num_threads=$SLURM_CPUS_PER_TASK --exprtype=count data
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=8 --mem=10g salmonte.sh
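Slurm sets SLURM_CPUS_PER_TASK only when --cpus-per-task is given, so a script that may also be run outside such an allocation could guard against the variable being unset. The sketch below is one way to do that; salmonte_threads is a hypothetical helper name, and the fallback of a single thread is an assumption of this example rather than a SalmonTE default.

```shell
#!/bin/bash
# Hypothetical helper (not part of SalmonTE): print the number of threads
# to use. Takes the CPU count allocated by Slurm when available; the
# fallback of 1 thread outside an allocation is an assumption of this sketch.
salmonte_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# Possible use inside salmonte.sh, with the same options as the script above:
#   SalmonTE.py quant --reference=hs --outpath=all_quant_out \
#       --num_threads="$(salmonte_threads)" --exprtype=count data
```

Submitted with `sbatch --cpus-per-task=8`, the helper prints 8, so the thread count keeps matching the allocation without being hard-coded in the script.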