AscatNGS contains the Cancer Genome Project's workflow implementation of the ASCAT copy number algorithm for paired-end sequencing.
Allocate an interactive session and run the program. Sample session:
[user@biowulf ~]$ sinteractive -c4 --mem=8g --gres=lscratch:10
salloc.exe: Pending job allocation 11188180
salloc.exe: job 11188180 queued and waiting for resources
salloc.exe: job 11188180 has been allocated resources
salloc.exe: Granted job allocation 11188180
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn0880 are ready for job
srun: error: x11: no local DISPLAY defined, skipping
error: unable to open file /tmp/slurm-spank-x11.11188180.0
slurmstepd: error: x11: unable to read DISPLAY value
[user@cn0880 ~]$ module load ascatngs
[+] Loading ascatngs 4.5.0 on cn0880
[+] Loading singularity 3.7.2 on cn0880
[user@cn0880 ~]$ ascat.pl
ERROR: Option must be defined.
Usage:
ascat.pl [options]
Please define as many of the parameters as possible
Required parameters
-outdir -o Folder to output result to.
-tumour -t Tumour BAM/CRAM/counts file (counts must be .gz)
-normal -n Normal BAM/CRAM/counts file (counts must be .gz)
-reference -r Reference fasta
-snp_gc -sg Snp GC correction file
-protocol -pr Sequencing protocol (e.g. WGS, WXS)
-gender -g Sample gender (XX, XY, L, FILE)
For XX/XY see '-gc'
When 'L' see '-l'
FILE - matched normal is_male.txt from ascatCounts.pl
Targeted processing (further detail under OPTIONS):
-process -p Only process this step then exit, optionally set -index
-index -i Optionally restrict '-p' to single job
-limit -x Specifying 2 will balance processing between '-i 1 & 2'
Must be paired with '-p allele_count'
Optional parameters
-genderChr -gc Specify the 'Male' sex chromosome: Y,chrY...
-species -rs Reference species [BAM HEADER]
-assembly -ra Reference assembly [BAM HEADER]
-platform -pl Seqeuncing platform [BAM HEADER]
-minbasequal -q Minimum base quality required before allele is used. [20]
-cpus -c Number of cores to use. [1]
- recommend max 2 during 'input' process.
-locus -l Using a list of loci, default when '-L' [share/gender/GRCh37d5_Y.loci]
- these are loci that will not be present at all in a female sample
-force -f Force completion - solution not possible
- adding this will result in successful completion of analysis even
when ASCAT can't generate a solution. A default copynumber of 5/2
(tumour/normal) and contamination of 30% will be set along with a
comment in '*.samplestatistics.csv' to indicate this has occurred.
-purity -pu Purity (rho) setting for manual setting of sunrise plot location
-ploidy -pi Ploidy (psi) setting for manual setting of sunrise plot location
-noclean -nc Finalise results but don't clean up the tmp directory.
- Useful when including a manual check and restarting ascat with new pu and pi params.
-nobigwig -nb Don't generate BigWig files.
-t_name -tn Tumour name to use when using count files as input
-n_name -nn Noraml name to use when using count files as input
Other
-help -h Brief help message
-man -m Full documentation.
-version -v Ascat version number
[user@cn0880 ~]$ cd /lscratch/$SLURM_JOB_ID
[user@cn0880 11188180]$ cp /path/to/tumor.bam .
[user@cn0880 11188180]$ cp /path/to/normal.bam .
[user@cn0880 11188180]$ ascat.pl -o output -t tumor.bam -n normal.bam \
-r /fdb/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/genome.fa \
-snp_gc SnpGcCorrections.tsv -pr wgs -g XX -c $SLURM_CPUS_PER_TASK
[snip...]
[user@cn0880 11188180]$ exit
exit
srun: error: cn0880: task 0: Exited with exit code 2
salloc.exe: Relinquishing job allocation 11188180
[user@biowulf ~]$
Note: The SnpGcCorrections.tsv file must either be generated (see https://github.com/cancerit/ascatNgs/wiki/Convert-SnpPositions.tsv-to-SnpGcCorrections.tsv) or downloaded (see https://github.com/cancerit/ascatNgs/wiki/Human-reference-files-from-1000-genomes-VCFs). The run generates LogR.txt and BAF.txt, which can be used to produce non-segmented plots (see https://www.crick.ac.uk/peter-van-loo/software/ASCAT).
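Because the run depends on several input files being present in the working directory, a quick pre-flight check can catch a missing SnpGcCorrections.tsv or BAM before any CPU time is spent. The following is only a minimal sketch: check_inputs is a hypothetical helper, not part of AscatNGS, and the file names are the ones used in the sample session above.

```shell
#!/bin/bash
# check_inputs: hypothetical helper (not part of AscatNGS).
# Returns non-zero if any of the listed files is missing or empty.
check_inputs() {
    local missing=0 f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example use (file names as in the sample session):
#   check_inputs tumor.bam normal.bam SnpGcCorrections.tsv || exit 1
```

Only existence and non-zero size are tested here; the check does not validate BAM contents or the format of the correction file.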
Create a batch input file (e.g. AscatNGS.sh). For example:
#!/bin/bash
module load ascatngs
export GENOME=/fdb/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/
ascat.pl -o output -t tumor.bam -n normal.bam \
-r $GENOME/genome.fa \
-snp_gc SnpGcCorrections.tsv -pr wgs -g XX -c $SLURM_CPUS_PER_TASK
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=16 --mem=30g AscatNGS.sh
Create a swarmfile (e.g. AscatNGS.swarm). For example:
ascat.pl -o output1 -t tumor1.bam -n normal1.bam ... -c $SLURM_CPUS_PER_TASK
ascat.pl -o output2 -t tumor2.bam -n normal2.bam ... -c $SLURM_CPUS_PER_TASK
ascat.pl -o output3 -t tumor3.bam -n normal3.bam ... -c $SLURM_CPUS_PER_TASK
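With many tumour/normal pairs, the swarmfile can be generated rather than written by hand. The sketch below assumes a hypothetical tab-separated sample sheet (columns: sample name, tumour BAM, normal BAM, gender) and a hypothetical helper named make_swarm; neither is part of AscatNGS or swarm. The ascat.pl options are the ones shown in the batch example, with -pr wgs assumed for every sample.

```shell
#!/bin/bash
# make_swarm: hypothetical helper (not part of AscatNGS or swarm).
# Arguments: reference fasta, SNP GC correction file.
# Reads a tab-separated sample sheet on stdin with the columns
#   sample name, tumour BAM, normal BAM, gender (XX or XY)
# and prints one ascat.pl command per sample.
make_swarm() {
    local ref="$1" snp_gc="$2"
    local name tumour normal gender
    while IFS=$'\t' read -r name tumour normal gender || [ -n "$name" ]; do
        # Skip blank lines and comment lines starting with '#'.
        [ -z "$name" ] && continue
        case "$name" in \#*) continue ;; esac
        # $SLURM_CPUS_PER_TASK is written literally so that it is
        # expanded inside each swarm subjob, not when the file is built.
        printf 'ascat.pl -o %s -t %s -n %s -r %s -snp_gc %s -pr wgs -g %s -c $SLURM_CPUS_PER_TASK\n' \
            "output_$name" "$tumour" "$normal" "$ref" "$snp_gc" "$gender"
    done
}

# Example use (samples.tsv is a hypothetical file name):
#   make_swarm "$GENOME/genome.fa" SnpGcCorrections.tsv < samples.tsv > AscatNGS.swarm
```

Each sample gets its own output directory (output_NAME) so that concurrent subjobs do not write to the same location.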
Submit this job using the swarm command.
swarm -f AscatNGS.swarm -g 30 -t 16 --module ascatngs
where
| -g # | Number of Gigabytes of memory required for each process (1 line in the swarm command file) |
| -t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
| --module ascatngs | Loads the Ascat NGS module for each subjob in the swarm |