Deepmod2 is a tool for detecting DNA 5mC methylation from Oxford Nanopore reads. It can call methylation from POD5 and FAST5 files basecalled with either Guppy or Dorado. The main output is a methylation-tagged BAM file; per-read and per-site prediction files are written alongside it.
Allocate an interactive session and run the program. Sample session (based on Deepmod2's tutorial):
[user@biowulf ~]$ sinteractive --mem=15g --cpus-per-task=8 --gres=gpu:v100x:1
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load deepmod2 dorado samtools minimap2
[user@cn3144 ~]$ cd /data/${USER}
[user@cn3144 ~]$ INPUT_DIR=data
[user@cn3144 ~]$ OUTPUT_DIR=mod
[user@cn3144 ~]$ mkdir -pv ${INPUT_DIR}/nanopore_raw_data ${OUTPUT_DIR}
[user@cn3144 ~]$ tar xzf ${DEEPMOD2_TEST_DATA}/sample.pod5.tar.gz -C ${INPUT_DIR}/nanopore_raw_data
[user@cn3144 ~]$ dorado basecaller --emit-moves --recursive \
    ${DORADO_MODELS}/dna_r10.4.1_e8.2_400bps_hac@v4.3.0 \
    ${INPUT_DIR}/nanopore_raw_data > ${OUTPUT_DIR}/basecalled.bam
[2025-06-05 09:26:27.940] [info] Running: "basecaller" "--emit-moves" "--recursive" "/fdb/dorado/0.9.6/dna_r10.4.1_e8.2_400bps_hac@v4.3.0" "data/nanopore_raw_data"
[2025-06-05 09:26:28.099] [info] Normalised: overlap 500 -> 498
[2025-06-05 09:26:28.099] [info] Normalised: chunksize 10000 -> 9996
[2025-06-05 09:26:28.099] [info] > Creating basecall pipeline
[2025-06-05 09:26:29.108] [info] Calculating optimized batch size for GPU "Tesla V100-SXM2-32GB" and model dna_r10.4.1_e8.2_400bps_hac@v4.3.0. Full benchmarking will run for this device, which may take some time.
[2025-06-05 09:28:05.773] [info] cuda:0 using chunk size 9996, batch size 3328
[2025-06-05 09:28:06.912] [info] cuda:0 using chunk size 4998, batch size 6784
[2025-06-05 09:28:12.714] [info] > Finished in (ms): 3608
[2025-06-05 09:28:12.714] [info] > Simplex reads basecalled: 59
[2025-06-05 09:28:12.714] [info] > Basecalled @ Samples/s: 7.746490e+06
[2025-06-05 09:28:12.714] [info] > Finished
[user@cn3144 ~]$ samtools fastq ${OUTPUT_DIR}/basecalled.bam -T "*" | \
    minimap2 -ax map-ont \
    /fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa - -y | \
    samtools view -o ${OUTPUT_DIR}/aligned.bam
[M::mm_idx_gen::75.344*1.59] collected minimizers
[M::mm_idx_gen::94.013*1.86] sorted minimizers
[M::main::94.013*1.86] loaded/built the index for 195 target sequence(s)
[M::mm_mapopt_update::96.431*1.84] mid_occ = 694
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 195
[M::mm_idx_stat::97.809*1.83] distinct minimizers: 100167746 (38.80% are singletons); average occurrences: 5.519; average spacing: 5.607; total length: 3099922541
[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 61 reads
[M::worker_pipeline::99.833*1.81] mapped 61 sequences
[M::main] Version: 2.29-r1283
[M::main] CMD: minimap2 -ax map-ont -y /fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa -
[M::main] Real time: 100.028 sec; CPU: 180.762 sec; Peak RSS: 11.348 GB
[user@cn3144 ~]$ deepmod2 detect --seq_type dna --model bilstm_r10.4.1_5khz_v4.3 \
    --file_type pod5 --bam mod/aligned.bam --input data/nanopore_raw_data \
    --output mod/deepmod2/ \
    --ref /fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa \
    --threads 8
2025-06-05 10:45:20.736429: Starting Per Read Methylation Detection.
2025-06-05 10:45:20.770338: Getting motif positions from the reference.
2025-06-05 10:48:28.819651: Finished getting motif positions from the reference.
2025-06-05 10:48:28.890963: Building BAM index.
2025-06-05 10:48:28.924808: Finished building BAM index.
2025-06-05 10:48:30.101261: Reading inputs complete.
2025-06-05 10:48:51.120458: Model predictions complete. Wrapping up output.
2025-06-05 10:48:51.376787: Number of reads processed: 57
2025-06-05 10:48:51.376847: Finished Per-Read Methylation Output. Starting Per-Site output.
2025-06-05 10:48:51.376857: Modification Tagged BAM file: mod/deepmod2/output.bam
2025-06-05 10:48:51.376873: Per Read Prediction file: mod/deepmod2/output.per_read
2025-06-05 10:48:51.376888: Writing Per Site Methylation Detection.
2025-06-05 10:48:51.413912: Finished Writing Per Site Methylation Output.
2025-06-05 10:48:51.413942: Per Site Prediction file: mod/deepmod2/output.per_site
2025-06-05 10:48:51.413951: Aggregated Per Site Prediction file: mod/deepmod2/output.per_site.aggregated
2025-06-05 10:48:53.018012: Time elapsed=213.6859s

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
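After a run like the one above, it can be useful to confirm that every output file named in the log was actually written. The following is a minimal sketch in bash, not part of Deepmod2 itself: the function name check_deepmod2_outputs is hypothetical, and the four file names are taken from the log messages in the sample session (they assume the default output naming shown there).

```shell
# Sketch: report expected Deepmod2 output files that are missing or empty.
# The function name is hypothetical; file names come from the sample session log.
check_deepmod2_outputs() {
    local out_dir=$1
    local missing=0
    local f
    for f in output.bam output.per_read output.per_site output.per_site.aggregated; do
        # -s is true only if the file exists and has a size greater than zero
        if [ ! -s "${out_dir}/${f}" ]; then
            echo "missing or empty: ${out_dir}/${f}" >&2
            missing=1
        fi
    done
    # Return 0 when all files are present, 1 otherwise
    return "${missing}"
}
```

For the session above, the call would be check_deepmod2_outputs mod/deepmod2, which returns a non-zero status and lists the affected files on standard error if anything is missing.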
Create a batch input file (e.g. deepmod2.sh). For example:
#!/bin/bash
#SBATCH --job-name=deepmod2
#SBATCH --gres=gpu:v100:1
#SBATCH --mem=16g
#SBATCH --cpus-per-task=8
#SBATCH --time=1:00:00

module load deepmod2 dorado samtools minimap2

cd /data/${USER}
INPUT_DIR=data
OUTPUT_DIR=mod
mkdir -pv ${INPUT_DIR}/nanopore_raw_data ${OUTPUT_DIR}
tar xzf ${DEEPMOD2_TEST_DATA}/sample.pod5.tar.gz -C ${INPUT_DIR}/nanopore_raw_data

# Basecall with move tables, which deepmod2 needs to map signal to bases
dorado basecaller --emit-moves --recursive \
    ${DORADO_MODELS}/dna_r10.4.1_e8.2_400bps_hac@v4.3.0 \
    ${INPUT_DIR}/nanopore_raw_data > ${OUTPUT_DIR}/basecalled.bam

# Align to the reference, carrying all BAM tags through FASTQ comments (-T "*" and -y)
samtools fastq ${OUTPUT_DIR}/basecalled.bam -T "*" | \
    minimap2 -ax map-ont \
    /fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa - -y | \
    samtools view -o ${OUTPUT_DIR}/aligned.bam

# Call 5mC methylation
deepmod2 detect --seq_type dna --model bilstm_r10.4.1_5khz_v4.3 \
    --file_type pod5 --bam mod/aligned.bam --input data/nanopore_raw_data \
    --output mod/deepmod2/ \
    --ref /fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa \
    --threads 8
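A batch script like this one depends on its directory variables being set consistently, since a misspelled variable name silently expands to an empty string in bash. As a hedged sketch only, the following preamble shows one way to make such mistakes fail early. It uses standard bash features (set -euo pipefail and the ${VAR:?message} expansion); the directory names simply mirror the example above.

```shell
#!/bin/bash
# Sketch: fail-fast preamble for a pipeline script.
set -euo pipefail   # exit on errors, on unset variables, and on failed pipeline stages

INPUT_DIR=data
OUTPUT_DIR=mod

# ${VAR:?message} aborts with the message if VAR is unset or empty
mkdir -p "${INPUT_DIR:?INPUT_DIR is not set}/nanopore_raw_data" \
         "${OUTPUT_DIR:?OUTPUT_DIR is not set}"

echo "raw data: ${INPUT_DIR}/nanopore_raw_data, results: ${OUTPUT_DIR}"
```

With set -u in effect, a later reference to a variable that was never assigned stops the job at that line instead of writing to an unintended path.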
Submit this job using the Slurm sbatch command.
sbatch deepmod2.sh
Create a swarmfile (e.g. deepmod2.swarm). For example:
deepmod2 detect --seq_type dna --model bilstm_r10.4.1_5khz_v4.3 \
    --file_type pod5 --bam mod/aligned_01.bam --input data/nanopore_raw_data \
    --output mod/deepmod2_01/ \
    --ref /fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa \
    --threads 8
deepmod2 detect --seq_type dna --model bilstm_r10.4.1_5khz_v4.3 \
    --file_type pod5 --bam mod/aligned_02.bam --input data/nanopore_raw_data \
    --output mod/deepmod2_02/ \
    --ref /fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa \
    --threads 8
deepmod2 detect --seq_type dna --model bilstm_r10.4.1_5khz_v4.3 \
    --file_type pod5 --bam mod/aligned_03.bam --input data/nanopore_raw_data \
    --output mod/deepmod2_03/ \
    --ref /fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa \
    --threads 8
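For more than a handful of samples, the swarmfile can be generated rather than typed by hand. The sketch below is one possible approach, assuming the aligned BAM files follow the mod/aligned_NN.bam naming used above. The sample IDs are hypothetical, and giving each sample its own output directory is an assumption made here so that concurrent subjobs do not overwrite each other's files.

```shell
#!/bin/bash
# Sketch: write deepmod2.swarm with one deepmod2 command per sample.
set -euo pipefail

REF=/fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa
SAMPLES=(01 02 03)        # hypothetical sample IDs
SWARMFILE=deepmod2.swarm

: > "${SWARMFILE}"        # create the file, or truncate an existing one
for id in "${SAMPLES[@]}"; do
    # echo joins its arguments with single spaces, so each sample is one line
    echo "deepmod2 detect --seq_type dna --model bilstm_r10.4.1_5khz_v4.3" \
         "--file_type pod5 --bam mod/aligned_${id}.bam --input data/nanopore_raw_data" \
         "--output mod/deepmod2_${id}/ --ref ${REF} --threads 8" >> "${SWARMFILE}"
done
```

Each generated command occupies a single line, so no backslash continuations are needed in the resulting file.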
Submit this job using the swarm command.
swarm -f deepmod2.swarm [-g #] [-t #] --module deepmod2
where
-g # | Number of gigabytes of memory required for each process (1 line in the swarm command file). |
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
--module deepmod2 | Loads the deepmod2 module for each subjob in the swarm. |