MATS is a computational tool to detect differential alternative splicing events from RNA-Seq data. The statistical model of MATS calculates the P-value and false discovery rate that the difference in the isoform ratio of a gene between two conditions exceeds a given user-defined threshold. The replicate MATS (rMATS) is designed for detection of differential alternative splicing from replicate RNA-Seq data.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive -c 16 --mem 45g --gres=lscratch:20
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144 ~]$ mkdir -p /data/$USER/rmats && cd /data/$USER/rmats

Here is how to use the most recent rMATS version, 4.3.0:
[user@cn3144 ~]$ module load rmats/4.3.0
[+] Loading singularity 3.8.5-1 on cn3144
[+] Loading rMATS 4.3.0
[user@cn3144 ~]$ cp -r $RMATS_SRC/* .
[user@cn3144 ~]$ make
...
-- Configuring done
-- Generating done
-- Build files have been written to: /usr/local/apps/rMATS/4.3.0/rmats-turbo-4.3.0/bamtools/build
make[2]: Entering directory '/vf/users/denisovga/rMATS/bamtools/build'
make[3]: Entering directory '/vf/users/denisovga/rMATS/bamtools/build'
make[3]: Leaving directory '/vf/users/denisovga/rMATS/bamtools/build'
make[3]: Entering directory '/vf/users/denisovga/rMATS/bamtools/build'
make[3]: Leaving directory '/vf/users/denisovga/rMATS/bamtools/build'
[ 0%] Built target SharedHeaders
make[3]: Entering directory '/vf/users/denisovga/rMATS/bamtools/build'
Consolidate compiler generated dependencies of target BamTools-static
make[3]: Leaving directory '/vf/users/denisovga/rMATS/bamtools/build'
make[3]: Entering directory '/vf/users/denisovga/rMATS/bamtools/build'
[ 1%] Building CXX object src/api/CMakeFiles/BamTools-static.dir/BamAlignment.cpp.o
[ 2%] Building CXX object src/api/CMakeFiles/BamTools-static.dir/BamMultiReader.cpp.o
[ 3%] Building CXX object src/api/CMakeFiles/BamTools-static.dir/BamReader.cpp.o
[ 4%] Building CXX object src/api/CMakeFiles/BamTools-static.dir/BamWriter.cpp.o
...
[ 95%] Building CXX object src/toolkit/CMakeFiles/bamtools_cmd.dir/bamtools_revert.cpp.o
[ 96%] Building CXX object src/toolkit/CMakeFiles/bamtools_cmd.dir/bamtools_sort.cpp.o
[ 97%] Building CXX object src/toolkit/CMakeFiles/bamtools_cmd.dir/bamtools_split.cpp.o
[ 98%] Building CXX object src/toolkit/CMakeFiles/bamtools_cmd.dir/bamtools_stats.cpp.o
[ 99%] Building CXX object src/toolkit/CMakeFiles/bamtools_cmd.dir/bamtools.cpp.o
make[3]: Leaving directory '/vf/users/denisovga/rMATS/bamtools/build'
[100%] Built target bamtools_cmd
make[2]: Leaving directory '/vf/users/denisovga/rMATS/bamtools/build'
make[1]: Leaving directory '/vf/users/denisovga/rMATS/bamtools/build'
# rm -f to ignore nonexistent files since *.dylib will only exist for mac
cd bamtools/lib; rm -f *.so *.so.* *.dylib
cd rMATS_C; make;
make[1]: Entering directory '/vf/users/denisovga/rMATS/rMATS_C'
cd lbfgs_scipy && make
make[2]: Entering directory '/vf/users/denisovga/rMATS/rMATS_C/lbfgs_scipy'
f77 -c -O2 -c -o lbfgsb.o lbfgsb.f
f77 -c -O2 -c -o linpack.o linpack.f
f77 -c -O2 -c -o timer.o timer.f
make[2]: Leaving directory '/vf/users/denisovga/rMATS/rMATS_C/lbfgs_scipy'
cc -Wall -O2 -msse2 -funroll-loops -fopenmp -fPIE -o rMATSexe src/main.c src/myfunc.c src/util.c lbfgs_scipy/lbfgsb.o lbfgs_scipy/linpack.o lbfgs_scipy/timer.o -lm -lgfortran -lgsl -lgslcblas -lgomp -fPIE -L/usr/lib/x86_64-linux-gnu -lblas -llapack
make clean
make[2]: Entering directory '/vf/users/denisovga/rMATS/rMATS_C'
rm -rf util.o myfunc.o *.o src/*.o lbfgs_scipy/*.o
make[2]: Leaving directory '/vf/users/denisovga/rMATS/rMATS_C'
make[1]: Leaving directory '/vf/users/denisovga/rMATS/rMATS_C'
cd rMATS_pipeline; python setup.py build_ext;
Compiling rmatspipeline/rmatspipeline.pyx because it changed.
[1/1] Cythonizing rmatspipeline/rmatspipeline.pyx
running build_ext
building 'rmats.rmatspipeline' extension
gcc -pthread -B /opt/conda/envs/rmats/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/opt/conda/envs/rmats/rmats-turbo-4.3.0/rMATS_pipeline/rmatspipeline -I/vf/users/denisovga/rMATS/bamtools/include -I/opt/conda/envs/rmats/include/python3.6m -c rmatspipeline/rmatspipeline.cpp -o build/temp.linux-x86_64-3.6/rmatspipeline/rmatspipeline.o -O3 -funroll-loops -std=c++11 -fopenmp -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS -w -Wl,-static
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
g++ -pthread -shared -B /opt/conda/envs/rmats/compiler_compat -L/opt/conda/envs/rmats/lib -Wl,-rpath=/opt/conda/envs/rmats/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.6/rmatspipeline/rmatspipeline.o -L/vf/users/denisovga/rMATS/bamtools/lib -lm -lstdc++ -lbamtools -lz -o build/lib.linux-x86_64-3.6/rmats/rmatspipeline.cpython-36m-x86_64-linux-gnu.so -lbamtools -lm -std=c++11 -lz -fopenmp
cp `find ./rMATS_pipeline/build | grep so` .;
[user@cn3144 ~]$ ./test_rmats
test (tests.allow_clipping.test.ClippingAllowedTest) ... ok
test (tests.allow_clipping.test.ClippingNotAllowedTest) ... ok
test (tests.alternative_3_splice_site_novel.test.NovelJunction) ... ok
test (tests.alternative_3_splice_site_novel.test.NovelSpliceSite) ... ok
test (tests.alternative_5_splice_site_novel.test.NovelJunction) ... ok
test (tests.alternative_5_splice_site_novel.test.NovelSpliceSite) ... ok
test (tests.fixed_event_set.test.Test) ... ok
test (tests.individual_counts.test.FilteredTest) ... ok
test (tests.individual_counts.test.NotFilteredTest) ... ok
test (tests.mutually_exclusive_exons_novel.test.NovelJunction) ... ok
test (tests.mutually_exclusive_exons_novel.test.NovelSpliceSite) ... ok
test (tests.only_one_sample.test.StatOffTest) ... ok
test (tests.only_one_sample.test.StatOnTest) ... ok
test (tests.overlapped_gene.test.Test) ... ok
test (tests.prep_post.test.Test) ... ok
test (tests.read_count_edge_cases.test.Test) ... ok
test (tests.retained_intron_novel.test.NovelJunction) ... ok
test (tests.retained_intron_novel.test.NovelSpliceSite) ... ok
test (tests.skipped_exon_basic.test.Test) ... ok
test (tests.skipped_exon_novel.test.NovelJunction) ... ok
test (tests.skipped_exon_novel.test.NovelSpliceSite) ... ok
test (tests.skipped_exon_novel.test.NovelSpliceSiteNotNovelJunction) ... ok
test (tests.stat_large_file.test.Test) ... ok
test (tests.stranded.test.PairedFirstStrandTest) ... ok
test (tests.stranded.test.PairedSecondStrandTest) ... ok
test (tests.stranded.test.PairedUnstrandedTest) ... ok
test (tests.stranded.test.SingleEndFirstStrandTest) ... ok
test (tests.stranded.test.SingleEndSecondStrandTest) ... ok
test (tests.stranded.test.SingleEndUnstrandedTest) ... ok
test (tests.task_stat.test.Test) ... ok
test (tests.variable_read_length.test.Length1Test) ... ok
test (tests.variable_read_length.test.Length1VariableTest) ... ok
test (tests.variable_read_length.test.Length2Test) ... ok
test (tests.variable_read_length.test.Length2VariableTest) ... ok
test (tests.variable_read_length.test.NoLengthTest) ... ok
----------------------------------------------------------------------
Ran 35 tests in 114.831s

OK
[user@cn3144 ~]$ cp -r $RMATS_DATA/* .
[user@cn3144 ~]$ rmats.py --s1 $PWD/s1.txt \
    --s2 $PWD/s2.txt \
    --gtf gtf/Homo_sapiens.Ensembl.GRCh37.72.gtf \
    --bi /fdb/STAR_indices/2.7.8a/GENCODE/Gencode_human/release_27/genes-100 \
    --od out_test \
    -t paired \
    --nthread $SLURM_CPUS_PER_TASK \
    --readLength 50 \
    --tophatAnchor 8 \
    --cstat 0.0001 \
    --tstat 6 \
    --statoff \
    --tmp /lscratch/${SLURM_JOB_ID}
...
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
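The session above uses the s1.txt and s2.txt files shipped with the test data. To run on your own reads, the sketch below writes equivalent sample-list files. It assumes the format described in the rMATS README for FASTQ input: a single line per group, replicates separated by commas, and the two mates of a paired-end replicate separated by a colon. The directory and file names are hypothetical placeholders.

```shell
#!/bin/bash
# Sketch: write rMATS sample-list files for paired-end FASTQ input.
# The fastq/ directory and all file names below are hypothetical.
set -euo pipefail

fastq_dir="fastq"

# Join all arguments with commas (one "R1:R2" argument per replicate).
join_replicates() {
    local IFS=','
    echo "$*"
}

# Group 1 (e.g. control), two replicates.
join_replicates \
    "$fastq_dir/ctrl_rep1_R1.fastq:$fastq_dir/ctrl_rep1_R2.fastq" \
    "$fastq_dir/ctrl_rep2_R1.fastq:$fastq_dir/ctrl_rep2_R2.fastq" > s1.txt

# Group 2 (e.g. treatment), two replicates.
join_replicates \
    "$fastq_dir/treat_rep1_R1.fastq:$fastq_dir/treat_rep1_R2.fastq" \
    "$fastq_dir/treat_rep2_R1.fastq:$fastq_dir/treat_rep2_R2.fastq" > s2.txt
```

The resulting files can be passed to rmats.py through --s1 and --s2 exactly as in the session above.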
Create a batch input file (e.g. submit.sh). For example:
#!/bin/bash
set -e
module load rmats
export TMPDIR=/lscratch/$SLURM_JOBID
rmats.py --s1 $PWD/s1.txt \
    --s2 $PWD/s2.txt \
    --gtf $PWD/gtf/Homo_sapiens.Ensembl.GRCh37.72.gtf \
    --bi /fdb/STAR_indices/2.7.8a/GENCODE/Gencode_human/release_27/genes-100 \
    --od out_test \
    -t paired \
    --nthread $SLURM_CPUS_PER_TASK \
    --readLength 50 \
    --tophatAnchor 8 \
    --cstat 0.0001 \
    --tstat 6 \
    --tmp /lscratch/${SLURM_JOB_ID}
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=16 --mem=30g --gres=lscratch:20 submit.sh
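Once the batch job finishes, the result tables in the output directory can be screened for significant events. The following is a minimal sketch, not part of rMATS itself. It assumes the output directory contains tab-delimited tables such as SE.MATS.JC.txt whose header includes the columns FDR and IncLevelDifference, which are produced only when the statistical model is run (that is, without --statoff). The function name filter_events and the default thresholds are illustrative choices.

```shell
#!/bin/bash
# Sketch: keep events with FDR <= max_fdr and |IncLevelDifference| >= min_dpsi.
# filter_events is a hypothetical helper; columns are located by header name.
filter_events() {
    local file=$1 max_fdr=${2:-0.05} min_dpsi=${3:-0.1}
    awk -F'\t' -v max_fdr="$max_fdr" -v min_dpsi="$min_dpsi" '
        NR == 1 {
            # Map column names to positions from the header line.
            for (i = 1; i <= NF; i++) col[$i] = i
            if (!("FDR" in col) || !("IncLevelDifference" in col)) {
                print "expected columns not found in header" > "/dev/stderr"
                exit 1
            }
            print
            next
        }
        {
            fdr = $col["FDR"]
            d = $col["IncLevelDifference"]
            # Skip rows without a numeric result.
            if (fdr == "NA" || d == "NA") next
            d += 0
            if (d < 0) d = -d
            if (fdr + 0 <= max_fdr + 0 && d >= min_dpsi + 0) print
        }
    ' "$file"
}

# Example use on the skipped-exon table, if it exists (path is an assumption).
if [ -f out_test/SE.MATS.JC.txt ]; then
    filter_events out_test/SE.MATS.JC.txt 0.05 0.1 > out_test/SE.significant.txt
fi
```

Looking columns up by name rather than by position keeps the filter usable across event types, whose tables have different numbers of coordinate columns.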
Create a swarmfile (e.g. rmats.swarm). For example:
rmats.py --s1 $PWD/s1.txt --s2 $PWD/s2.txt --gtf $PWD/gtf/Homo_sapiens.Ensembl.GRCh37.72.gtf --bi /fdb/STAR_indices/2.7.8a/GENCODE/Gencode_human/release_27/genes-100 --od out_test1 -t paired --nthread $SLURM_CPUS_PER_TASK --readLength 50 --tophatAnchor 8 --cstat 0.0001 --tstat 6 --tmp /lscratch/${SLURM_JOB_ID}
rmats.py --s1 $PWD/s3.txt --s2 $PWD/s4.txt --gtf $PWD/gtf/Homo_sapiens.Ensembl.GRCh37.72.gtf --bi /fdb/STAR_indices/2.7.8a/GENCODE/Gencode_human/release_27/genes-100 --od out_test2 -t paired --nthread $SLURM_CPUS_PER_TASK --readLength 50 --tophatAnchor 8 --cstat 0.0001 --tstat 6 --tmp /lscratch/${SLURM_JOB_ID}
rmats.py --s1 $PWD/s5.txt --s2 $PWD/s6.txt --gtf $PWD/gtf/Homo_sapiens.Ensembl.GRCh37.72.gtf --bi /fdb/STAR_indices/2.7.8a/GENCODE/Gencode_human/release_27/genes-100 --od out_test3 -t paired --nthread $SLURM_CPUS_PER_TASK --readLength 50 --tophatAnchor 8 --cstat 0.0001 --tstat 6 --tmp /lscratch/${SLURM_JOB_ID}
Submit this job using the swarm command.
swarm -f rmats.swarm [-g 30] [-t 16] --gres=lscratch:20 --module rmats
where
-g # | Number of gigabytes of memory required for each process (1 line in the swarm command file) |
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file) |
--gres=lscratch:# | Allocates local scratch space (in GB) under /lscratch, which the commands above use via --tmp |
--module rmats | Loads the rmats module for each subjob in the swarm |
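When there are many comparisons, writing the swarmfile by hand invites path typos. The sketch below generates rmats.swarm from a list of comparisons, one command per line. The sample-list names and output directories mirror the example above; the variable names (comparisons, gtf, star_index) are illustrative, and the rMATS options are copied unchanged from the swarmfile example.

```shell
#!/bin/bash
# Sketch: generate rmats.swarm with one rmats.py command per comparison.
set -euo pipefail

# Each entry: "<s1 file> <s2 file> <output directory>" (names as in the example).
comparisons=(
    "s1.txt s2.txt out_test1"
    "s3.txt s4.txt out_test2"
    "s5.txt s6.txt out_test3"
)

gtf="$PWD/gtf/Homo_sapiens.Ensembl.GRCh37.72.gtf"
star_index=/fdb/STAR_indices/2.7.8a/GENCODE/Gencode_human/release_27/genes-100

: > rmats.swarm   # start with an empty swarmfile
for entry in "${comparisons[@]}"; do
    read -r s1 s2 od <<< "$entry"
    # The escaped \$ keeps the Slurm variables literal, so they are expanded
    # inside each subjob rather than when this script runs.
    echo "rmats.py --s1 $PWD/$s1 --s2 $PWD/$s2 --gtf $gtf --bi $star_index --od $od -t paired --nthread \$SLURM_CPUS_PER_TASK --readLength 50 --tophatAnchor 8 --cstat 0.0001 --tstat 6 --tmp /lscratch/\${SLURM_JOB_ID}" >> rmats.swarm
done
```

The generated file can then be submitted with the swarm command shown above.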