Dorado is a basecaller for Oxford Nanopore reads.
Dorado v0.9.6 is the last release that supports Fast5 files and the basecalling models for DNA R10.4.1 4 kHz data, DNA R9.4.1, and RNA002. See the v1.0 release notes.
Allocate an interactive session and run the program.
Sample session (user input follows the `$` prompt):
    [user@biowulf]$ sinteractive --gres=gpu:v100x:1,lscratch:200 --mem=16g --cpus-per-task=6
    salloc.exe: Pending job allocation 46116226
    salloc.exe: job 46116226 queued and waiting for resources
    salloc.exe: job 46116226 has been allocated resources
    salloc.exe: Granted job allocation 46116226
    salloc.exe: Waiting for resource configuration
    salloc.exe: Nodes cn3144 are ready for job

    [user@cn3144 ~]$ module load dorado
    [user@cn3144 ~]$ cd /lscratch/$SLURM_JOB_ID
    [user@cn3144 ~]$ cp -rL "${DORADO_TEST_DATA:-none}" input
    [user@cn3144 ~]$ ls -lh input
    -rw-r--r--. 1 user group 20G Jun 2 17:21 reads.pod5
    [user@cn3144 ~]$ # emits unaligned bam by default
    [user@cn3144 ~]$ dorado basecaller --device cuda:all ${DORADO_MODELS}/dna_r10.4.1_e8.2_400bps_hac@v4.3.0 ${DORADO_TEST_DATA}/r10.4.1.pod5 > output.bam
    [user@cn3144 ~]$ exit
    salloc.exe: Relinquishing job allocation 46116226
    [user@biowulf ~]$
Dorado scales well up to 4 V100x GPUs. For a100 GPUs, 3 or fewer are ideal. Keep in mind that jobs requesting multiple GPUs may wait longer in the queue for resources.
GPU type | GPUs | Runtime [min], 0.7.3 | Runtime [min], 0.8.1 | Efficiency |
---|---|---|---|---|
V100x | 1 | 90 | 90 | 100% |
V100x | 2 | 45 | 44 | 100% |
V100x | 3 | 30 | 30 | 100% |
V100x | 4 | 23 | 23 | 100% |
a100 | 1 | 24 | 24 | 100% |
a100 | 2 | 12 | 12 | 100% |
a100 | 3 | 9 | 9 | 88% |
a100 | 4 | 7 | 7 | 85% |
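The efficiency column appears to be the usual parallel efficiency, 100 * T1 / (N * TN), where T1 is the single-GPU runtime and TN the runtime on N GPUs. The sketch below assumes that definition and uses a hypothetical helper named `efficiency`. Because it works from the rounded minutes in the table, it can differ by a few percent from the published column (for example, the V100x rows).

```shell
#!/bin/bash
# Parallel efficiency in percent: 100 * T1 / (N * TN).
# Integer arithmetic, so the result is rounded down.
# Assumption: this is how the table's Efficiency column is defined.
#   $1 = runtime on 1 GPU, $2 = number of GPUs, $3 = runtime on N GPUs
efficiency() {
    local t1=$1 n=$2 tn=$3
    echo $(( 100 * t1 / (n * tn) ))
}

# a100 rows from the table
efficiency 24 3 9   # prints 88
efficiency 24 4 7   # prints 85
```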
Create a batch input file (e.g. dorado.sh). For example:
    #!/bin/bash
    set -e
    module load dorado
    cd /lscratch/$SLURM_JOB_ID
    dorado basecaller --device cuda:all ${DORADO_MODELS}/dna_r10.4.1_e8.2_400bps_hac@v4.3.0 ${DORADO_TEST_DATA}/r10.4.1.pod5 > output.bam
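Since GPU jobs can wait in the queue for some time, it may be worth failing early when a model or input path is missing. The following is a minimal sketch of such a pre-flight check. `require_path` is a hypothetical helper, not part of dorado or the module; the paths in the usage comments are the ones from the script above.

```shell
#!/bin/bash
# Hypothetical helper: report a missing path on stderr and return non-zero.
# In a script running under `set -e`, the non-zero status aborts the job
# before dorado is started.
require_path() {
    if [ ! -e "$1" ]; then
        echo "missing: $1" >&2
        return 1
    fi
}

# Possible use in dorado.sh, after `module load dorado`:
#   require_path "${DORADO_MODELS}/dna_r10.4.1_e8.2_400bps_hac@v4.3.0"
#   require_path "${DORADO_TEST_DATA}/r10.4.1.pod5"
```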
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=6 --mem=16g --gres=lscratch:50,gpu:v100x:1 --partition=gpu dorado.sh
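To request more than one GPU, only the count in `--gres` needs to change, because the script passes `--device cuda:all` and so uses every GPU allocated to the job. The sketch below prints (but does not run) such a command. `dorado_sbatch_cmd` is a hypothetical helper, and the other options are copied unchanged from the single-GPU example; whether CPUs and memory should grow with the GPU count is not covered here and is left as is.

```shell
#!/bin/bash
# Hypothetical helper: print an sbatch command for a given GPU type and count.
# The GPU type is passed through to --gres unchanged (v100x is the type used
# in the example above; other type names are an assumption).
#   $1 = GPU type, $2 = GPU count, $3 = batch script (default: dorado.sh)
dorado_sbatch_cmd() {
    local gpu_type=$1 count=$2 script=${3:-dorado.sh}
    echo "sbatch --cpus-per-task=6 --mem=16g --gres=lscratch:50,gpu:${gpu_type}:${count} --partition=gpu ${script}"
}

# Print the command for a 4-GPU V100x job
dorado_sbatch_cmd v100x 4
```

Review the printed command before running it; multi-GPU requests may spend longer in the queue.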