Clair3-RNA is a small-variant caller for long-read RNA sequencing (lrRNA-seq) data. It supports ONT R10.4.1 and R9.4.1 complementary DNA (cDNA) sequencing and direct RNA (dRNA) sequencing. For dRNA, it supports variant calling on data from the latest ONT kit, SQK-RNA004. Clair3-RNA also supports PacBio Sequel and PacBio MAS-Seq RNA sequencing data.
Allocate an interactive session and run the program.
Sample session (user input in bold):
[user@biowulf]$ sinteractive -c4
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144 ~]$ module load clair3-rna
[user@cn3144 ~]$ run_clair3_rna \
--bam_fn ${CLAIR3RNA_DATA}/ont/HG004_chr1_demo.bam \
--ref_fn ${CLAIR3RNA_DATA}/ont/GRCh38_no_alt_chr1.fa \
--output_dir out \
--threads ${SLURM_CPUS_PER_TASK} \
--platform "ont_guppy_drna002" \
--region chr1:816000-828000 \
--tag_variant_using_readiportal
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
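After the run finishes, a quick sanity check is to count the variant records that were written. The sketch below assumes the calls land in a gzip-compressed VCF at out/output.vcf.gz; that file name is an assumption and should be adjusted to whatever the run actually produced. The helper name count_vcf_records is hypothetical.

```shell
#!/bin/bash
# Count variant records (non-header lines) in a gzip-compressed VCF.
count_vcf_records() {
    local vcf="$1"
    # Header lines start with '#'; every other line is one variant record.
    # '|| true' keeps the exit status clean when the count is zero.
    gzip -dc "${vcf}" | grep -vc '^#' || true
}

# Assumed output location; change this if your run writes a different file.
vcf="out/output.vcf.gz"
if [[ -f "${vcf}" ]]; then
    echo "Variant records in ${vcf}: $(count_vcf_records "${vcf}")"
else
    echo "No VCF found at ${vcf}" >&2
fi
```

A count of zero for the demo region would suggest checking the log files in the output directory before moving on to larger inputs.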
Create a batch input file (e.g. clair3-rna.sh). For example:
#!/bin/bash
set -e
module load clair3-rna
run_clair3_rna \
--bam_fn ${CLAIR3RNA_DATA}/ont/HG004_chr1_demo.bam \
--ref_fn ${CLAIR3RNA_DATA}/ont/GRCh38_no_alt_chr1.fa \
--output_dir out \
--threads ${SLURM_CPUS_PER_TASK} \
--platform "ont_guppy_drna002" \
--region chr1:816000-828000 \
--tag_variant_using_readiportal
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=# [--mem=#] clair3-rna.sh
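Because a batch job may wait in the queue before it starts, it can help to fail fast when an input is missing. The following is a minimal sketch of a pre-flight check that could be placed in the batch script before the run_clair3_rna call. It assumes that an indexed BAM (.bai) and an indexed reference (.fai) are expected next to the input files, as is common for variant callers; the function name check_inputs is hypothetical and not part of Clair3-RNA.

```shell
#!/bin/bash
# Hypothetical pre-flight check: succeed only if the BAM, its index,
# the reference FASTA, and the FASTA index are all present.
check_inputs() {
    local bam="$1" ref="$2"
    [[ -f "${bam}" ]] || { echo "missing BAM: ${bam}" >&2; return 1; }
    # Accept either sample.bam.bai or sample.bai as the BAM index (assumption).
    [[ -f "${bam}.bai" || -f "${bam%.bam}.bai" ]] \
        || { echo "missing BAM index for ${bam}" >&2; return 1; }
    [[ -f "${ref}" ]] || { echo "missing reference: ${ref}" >&2; return 1; }
    [[ -f "${ref}.fai" ]] || { echo "missing FASTA index: ${ref}.fai" >&2; return 1; }
}
```

In a batch script this might be used as `check_inputs sample1.bam ref.fa || exit 1`, so the job stops with a clear message instead of failing partway through the caller.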
Create a swarmfile (e.g. clair3-rna.swarm). For example:
run_clair3_rna \
--bam_fn sample1.bam \
--ref_fn ${CLAIR3RNA_DATA}/ont/GRCh38_no_alt_chr1.fa \
--output_dir out1 \
--threads ${SLURM_CPUS_PER_TASK} \
--platform "ont_guppy_drna002" \
--region chr1:816000-828000 \
--tag_variant_using_readiportal
run_clair3_rna \
--bam_fn sample2.bam \
--ref_fn ${CLAIR3RNA_DATA}/ont/GRCh38_no_alt_chr1.fa \
--output_dir out2 \
--threads ${SLURM_CPUS_PER_TASK} \
--platform "ont_guppy_drna002" \
--region chr1:816000-828000 \
--tag_variant_using_readiportal
run_clair3_rna \
--bam_fn sample3.bam \
--ref_fn ${CLAIR3RNA_DATA}/ont/GRCh38_no_alt_chr1.fa \
--output_dir out3 \
--threads ${SLURM_CPUS_PER_TASK} \
--platform "ont_guppy_drna002" \
--region chr1:816000-828000 \
--tag_variant_using_readiportal
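The swarmfile above repeats one command with only the BAM file and output directory changing, so for many samples it may be easier to generate it. The sketch below writes each command on a single line. The function name make_swarm_lines and the out_<sample> directory naming are illustrative assumptions; the remaining options are copied from the example above.

```shell
#!/bin/bash
# Hypothetical helper: print one swarm command per BAM file given as an argument.
make_swarm_lines() {
    local bam sample
    # Single quotes keep ${...} literal, so those variables expand inside each
    # subjob at run time rather than when the swarmfile is generated.
    local fmt='run_clair3_rna --bam_fn %s --ref_fn ${CLAIR3RNA_DATA}/ont/GRCh38_no_alt_chr1.fa'
    fmt+=' --output_dir out_%s --threads ${SLURM_CPUS_PER_TASK} --platform "ont_guppy_drna002"'
    fmt+=' --region chr1:816000-828000 --tag_variant_using_readiportal\n'
    for bam in "$@"; do
        # Derive the sample name from the file name, e.g. sample1.bam -> sample1.
        sample=$(basename "${bam}" .bam)
        printf "${fmt}" "${bam}" "${sample}"
    done
}

# Write a swarmfile for three example inputs.
make_swarm_lines sample1.bam sample2.bam sample3.bam > clair3-rna.swarm
```

Giving each sample its own output directory keeps concurrent subjobs from overwriting one another's results.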
Submit this job using the swarm command.
swarm -f clair3-rna.swarm [-g #] -t # --module clair3-rna
where
| -g # | Number of gigabytes of memory required for each process (one command in the swarm command file). |
| -t # | Number of threads/CPUs required for each process (one command in the swarm command file). |
| --module clair3-rna | Loads the Clair3-RNA module for each subjob in the swarm. |