Camus on Biowulf

Camus: Fitting and de novo imputation of cancer mutational signatures

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session (based on camus's user manual):

[user@biowulf]$ sinteractive --mem=15g --cpus-per-task=8
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144 ~]$ module load camus

[user@cn3144 ~]$ cd /data/${USER}
[user@cn3144 ~]$ cp -r ${CAMUS_TEST_DATA}/* .

[user@cn3144 ~]$ # Prepare reference genome:
[user@cn3144 ~]$ camus fasta2nib -reference /fdb/igenomes/Homo_sapiens/UCSC/hg19/hg19.fa
Splitting reference genome in to separate chromosome files...
FastA to nib file conversion...

[user@cn3144 ~]$ # Produce a matrix of SNV counts:
[user@cn3144 ~]$ camus matgen -list vcf_files/list -norm no -number 16 \
                 -samples vcf_files/samples -genome_bin chr_nib/ -matrix mymatrix.txt
Processing Sample:vcf_files/T01.vcf
Processing Sample:vcf_files/T02.vcf
Processing Sample:vcf_files/T03.vcf
Processing Sample:vcf_files/T04.vcf
Processing Sample:vcf_files/T05.vcf
Processing Sample:vcf_files/T06.vcf
Processing Sample:vcf_files/T07.vcf
Processing Sample:vcf_files/T08.vcf
Processing Sample:vcf_files/T09.vcf
Processing Sample:vcf_files/T10.vcf
Processing Sample:vcf_files/T11.vcf
Processing Sample:vcf_files/T12.vcf
Processing Sample:vcf_files/T13.vcf
Processing Sample:vcf_files/T14.vcf
Processing Sample:vcf_files/T15.vcf
Processing Sample:vcf_files/T16.vcf
Counting total number of SNVs in single tumors...
T01.vcf 32441
T02.vcf 34806
T03.vcf 63070
T04.vcf 12501
T05.vcf 30799
T06.vcf 99274
T07.vcf 62692
T08.vcf 60410
T09.vcf 35067
T10.vcf 95105
T11.vcf 41282
T12.vcf 32887
T13.vcf 39714
T14.vcf 35899
T15.vcf 31069
T16.vcf 26625
Writing data matrix to a file
Finished writing the data matrix.


[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
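
Before running matgen it can help to confirm that the input directory holds the expected number of VCF files. The following is a minimal sketch, not part of camus: check_vcf_count is a hypothetical helper, and it assumes that the value passed to -number is the sample count (consistent with the 16 samples processed in the session above, but not confirmed here) and that the VCF files sit directly in the given directory with a .vcf extension.

```shell
#!/bin/bash
# Hypothetical helper (not part of camus): compare the number of *.vcf files
# in a directory with the expected sample count.
check_vcf_count() {
    local dir=$1 expected=$2
    local found
    # Count regular files ending in .vcf directly inside $dir.
    found=$(find "$dir" -maxdepth 1 -type f -name '*.vcf' | wc -l | tr -d '[:space:]')
    if [ "$found" -eq "$expected" ]; then
        echo "OK: $found VCF files in $dir"
    else
        echo "Mismatch: found $found VCF files in $dir, expected $expected" >&2
        return 1
    fi
}
```

With the test data copied as above, the check could be run as: check_vcf_count vcf_files 16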

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. camus.sh). For example:

#!/bin/bash
#SBATCH --job-name=camus
#SBATCH --mem=16g
#SBATCH --cpus-per-task=8
#SBATCH --time=1:00:00

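module load camus

# Work in the user's data directory; stop if it is not available.
cd /data/${USER} || exit 1
cp -r ${CAMUS_TEST_DATA}/* .

# Prepare reference genome:
camus fasta2nib -reference /fdb/igenomes/Homo_sapiens/UCSC/hg19/hg19.fa

# Produce a matrix of SNV counts:
camus matgen -list vcf_files/list -norm no -number 16 \
                 -samples vcf_files/samples -genome_bin chr_nib/ -matrix mymatrix.txt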

Submit this job using the Slurm sbatch command.

sbatch camus.sh

Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

Create a swarmfile (e.g. camus.swarm). For example:

camus matgen -list vcf_files1/list -norm no -number 16 \
                 -samples vcf_files1/samples -genome_bin chr_nib/ -matrix OUTPUT1/mymatrix.txt
camus matgen -list vcf_files2/list -norm no -number 16 \
                 -samples vcf_files2/samples -genome_bin chr_nib/ -matrix OUTPUT2/mymatrix.txt
camus matgen -list vcf_files3/list -norm no -number 16 \
                 -samples vcf_files3/samples -genome_bin chr_nib/ -matrix OUTPUT3/mymatrix.txt
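
Because the swarmfile above follows a regular pattern, it can also be generated with a short loop. This is a sketch under stated assumptions: the input directories are named vcf_files1, vcf_files2, ... as in the example, N_SETS is a hypothetical variable (not a camus or swarm option), and the OUTPUT directories are created up front in case matgen does not create them itself. Each command is written on a single line rather than with a backslash continuation.

```shell
#!/bin/bash
# Sketch: write one matgen command per numbered input directory to camus.swarm.
# N_SETS is a hypothetical variable chosen for this example.
N_SETS=3
swarmfile=camus.swarm

: > "$swarmfile"                      # start with an empty swarmfile
for i in $(seq 1 "$N_SETS"); do
    mkdir -p "OUTPUT${i}"             # make sure the output directory exists
    # Same options as the hand-written swarmfile, one command per line.
    echo "camus matgen -list vcf_files${i}/list -norm no -number 16 -samples vcf_files${i}/samples -genome_bin chr_nib/ -matrix OUTPUT${i}/mymatrix.txt" >> "$swarmfile"
done
```

The resulting camus.swarm contains one line per input set and can be submitted in the same way as a hand-written swarmfile.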

Submit this job using the swarm command.

swarm -f camus.swarm [-g #] [-t #] --module camus
where
-g #            Number of gigabytes of memory required for each process (1 line in the swarm command file)
-t #            Number of threads/CPUs required for each process (1 line in the swarm command file)
--module camus  Loads the camus module for each subjob in the swarm