DeepMod2 is a tool for detecting 5mC DNA methylation in Oxford Nanopore reads. It calls methylation from POD5 or FAST5 files basecalled with either Guppy or Dorado. The main output is a methylation-tagged BAM file, written alongside per-read and per-site prediction files.
Allocate an interactive session and run the program. Sample session (based on the DeepMod2 tutorial):
[user@biowulf ~]$ sinteractive --mem=15g --cpus-per-task=8 --gres=gpu:v100x:1
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144 ~]$ module load deepmod2 dorado samtools minimap2
[user@cn3144 ~]$ cd /data/${USER}
[user@cn3144 ~]$ INPUT_DIR=data
[user@cn3144 ~]$ OUTPUT_DIR=mod
[user@cn3144 ~]$ mkdir -pv ${INPUT_DIR}/nanopore_raw_data ${OUTPUT_DIR}
[user@cn3144 ~]$ tar xzf ${DEEPMOD2_TEST_DATA}/sample.pod5.tar.gz -C ${INPUT_DIR}/nanopore_raw_data
[user@cn3144 ~]$ dorado basecaller --emit-moves --recursive \
${DORADO_MODELS}/dna_r10.4.1_e8.2_400bps_hac@v4.3.0 \
${INPUT_DIR}/nanopore_raw_data > ${OUTPUT_DIR}/basecalled.bam
[2025-06-05 09:26:27.940] [info] Running: "basecaller" "--emit-moves" "--recursive" "/fdb/dorado/0.9.6/dna_r10.4.1_e8.2_400bps_hac@v4.3.0" "data/nanopore_raw_data"
[2025-06-05 09:26:28.099] [info] Normalised: overlap 500 -> 498
[2025-06-05 09:26:28.099] [info] Normalised: chunksize 10000 -> 9996
[2025-06-05 09:26:28.099] [info] > Creating basecall pipeline
[2025-06-05 09:26:29.108] [info] Calculating optimized batch size for GPU "Tesla V100-SXM2-32GB" and model dna_r10.4.1_e8.2_400bps_hac@v4.3.0. Full benchmarking will run for this device, which may take some time.
[2025-06-05 09:28:05.773] [info] cuda:0 using chunk size 9996, batch size 3328
[2025-06-05 09:28:06.912] [info] cuda:0 using chunk size 4998, batch size 6784
[2025-06-05 09:28:12.714] [info] > Finished in (ms): 3608
[2025-06-05 09:28:12.714] [info] > Simplex reads basecalled: 59
[2025-06-05 09:28:12.714] [info] > Basecalled @ Samples/s: 7.746490e+06
[2025-06-05 09:28:12.714] [info] > Finished
[user@cn3144 ~]$ samtools fastq ${OUTPUT_DIR}/basecalled.bam -T "*" | \
minimap2 -ax map-ont \
/fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa - -y | \
samtools view -o ${OUTPUT_DIR}/aligned.bam
[M::mm_idx_gen::75.344*1.59] collected minimizers
[M::mm_idx_gen::94.013*1.86] sorted minimizers
[M::main::94.013*1.86] loaded/built the index for 195 target sequence(s)
[M::mm_mapopt_update::96.431*1.84] mid_occ = 694
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 195
[M::mm_idx_stat::97.809*1.83] distinct minimizers: 100167746 (38.80% are singletons); average occurrences: 5.519; average spacing: 5.607; total length: 3099922541
[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 61 reads
[M::worker_pipeline::99.833*1.81] mapped 61 sequences
[M::main] Version: 2.29-r1283
[M::main] CMD: minimap2 -ax map-ont -y /fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa -
[M::main] Real time: 100.028 sec; CPU: 180.762 sec; Peak RSS: 11.348 GB
[user@cn3144 ~]$ deepmod2 detect --seq_type dna --model bilstm_r10.4.1_5khz_v4.3 \
--file_type pod5 --bam mod/aligned.bam --input data/nanopore_raw_data \
--output mod/deepmod2/ \
--ref /fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa \
--threads 8
2025-06-05 10:45:20.736429: Starting Per Read Methylation Detection.
2025-06-05 10:45:20.770338: Getting motif positions from the reference.
2025-06-05 10:48:28.819651: Finished getting motif positions from the reference.
2025-06-05 10:48:28.890963: Building BAM index.
2025-06-05 10:48:28.924808: Finished building BAM index.
2025-06-05 10:48:30.101261: Reading inputs complete.
2025-06-05 10:48:51.120458: Model predictions complete. Wrapping up output.
2025-06-05 10:48:51.376787: Number of reads processed: 57
2025-06-05 10:48:51.376847: Finished Per-Read Methylation Output. Starting Per-Site output.
2025-06-05 10:48:51.376857: Modification Tagged BAM file: mod/deepmod2/output.bam
2025-06-05 10:48:51.376873: Per Read Prediction file: mod/deepmod2/output.per_read
2025-06-05 10:48:51.376888: Writing Per Site Methylation Detection.
2025-06-05 10:48:51.413912: Finished Writing Per Site Methylation Output.
2025-06-05 10:48:51.413942: Per Site Prediction file: mod/deepmod2/output.per_site
2025-06-05 10:48:51.413951: Aggregated Per Site Prediction file: mod/deepmod2/output.per_site.aggregated
2025-06-05 10:48:53.018012: Time elapsed=213.6859s
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
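Before leaving the session it can be worth confirming that the run produced every file named in the log. The following is a minimal sketch, not part of DeepMod2: `check_deepmod2_outputs` is a hypothetical helper, and the four file names are taken from the sample log above, so they are an assumption for other versions or settings.

```shell
#!/bin/bash
# Hypothetical helper: check that a DeepMod2 output directory contains the
# four files named in the sample log. Each missing or empty file is reported
# on stderr, and the function returns 1 if any file is absent.
check_deepmod2_outputs() {
    local out_dir=$1
    local missing=0
    local name
    for name in output.bam output.per_read output.per_site output.per_site.aggregated; do
        if [ ! -s "${out_dir}/${name}" ]; then
            echo "missing or empty: ${out_dir}/${name}" >&2
            missing=1
        fi
    done
    return "${missing}"
}
```

For the session above this would be called as `check_deepmod2_outputs mod/deepmod2 && echo "all outputs present"`.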
Create a batch input file (e.g. deepmod2.sh). For example:
#!/bin/bash
#SBATCH --job-name=deepmod2
#SBATCH --gres=gpu:v100:1
#SBATCH --mem=16g
#SBATCH --cpus-per-task=8
#SBATCH --time=1:00:00
module load deepmod2 dorado samtools minimap2
cd /data/${USER}
INPUT_DIR=data
OUTPUT_DIR=mod
mkdir -pv ${INPUT_DIR}/nanopore_raw_data ${OUTPUT_DIR}
tar xzf ${DEEPMOD2_TEST_DATA}/sample.pod5.tar.gz -C ${INPUT_DIR}/nanopore_raw_data
dorado basecaller --emit-moves --recursive \
${DORADO_MODELS}/dna_r10.4.1_e8.2_400bps_hac@v4.3.0 \
${INPUT_DIR}/nanopore_raw_data > ${OUTPUT_DIR}/basecalled.bam
samtools fastq ${OUTPUT_DIR}/basecalled.bam -T "*" | \
minimap2 -ax map-ont \
/fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa - -y | \
samtools view -o ${OUTPUT_DIR}/aligned.bam
deepmod2 detect --seq_type dna --model bilstm_r10.4.1_5khz_v4.3 \
--file_type pod5 --bam mod/aligned.bam --input data/nanopore_raw_data \
--output mod/deepmod2/ \
--ref /fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa \
--threads 8
Submit this job using the Slurm sbatch command.
sbatch deepmod2.sh
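A batch job runs unattended, so cheap checks before the GPU steps can save a failed allocation. A shell redirect such as `> ${OUTPUT_DIR}/basecalled.bam` fails if the directory does not exist, and a hard-coded `--threads 8` can drift out of step with `--cpus-per-task`. The sketch below shows one way to guard against both. `prepare_dirs` and `job_threads` are hypothetical helpers, not part of DeepMod2 or Slurm. `SLURM_CPUS_PER_TASK` is the variable Slurm sets when `--cpus-per-task` is given.

```shell
#!/bin/bash
# Hypothetical helper: confirm the input directory holds at least one POD5
# file, then create the output directory so later redirects cannot fail.
prepare_dirs() {
    local input_dir=$1 output_dir=$2
    if [ -z "$(find "${input_dir}" -name '*.pod5' -print -quit 2>/dev/null)" ]; then
        echo "no POD5 files found under ${input_dir}" >&2
        return 1
    fi
    mkdir -p "${output_dir}"
}

# Hypothetical helper: use the CPU count Slurm allocated, falling back to 1
# when the script runs outside a job and the variable is unset.
job_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}
```

In the batch script these could be used as `prepare_dirs "${INPUT_DIR}/nanopore_raw_data" "${OUTPUT_DIR}" || exit 1` after the `tar` step, and `--threads "$(job_threads)"` in the `deepmod2 detect` call.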
Create a swarmfile (e.g. deepmod2.swarm). For example:
deepmod2 detect --seq_type dna --model bilstm_r10.4.1_5khz_v4.3 \
--file_type pod5 --bam mod/aligned_01.bam --input data/nanopore_raw_data \
--output mod/deepmod2_01/ \
--ref /fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa \
--threads 8
deepmod2 detect --seq_type dna --model bilstm_r10.4.1_5khz_v4.3 \
--file_type pod5 --bam mod/aligned_02.bam --input data/nanopore_raw_data \
--output mod/deepmod2_02/ \
--ref /fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa \
--threads 8
deepmod2 detect --seq_type dna --model bilstm_r10.4.1_5khz_v4.3 \
--file_type pod5 --bam mod/aligned_03.bam --input data/nanopore_raw_data \
--output mod/deepmod2_03/ \
--ref /fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa \
--threads 8
Submit this job using the swarm command.
swarm -f deepmod2.swarm [-g #] [-t #] --module deepmod2
where
| -g # | Number of gigabytes of memory required for each process (1 line in the swarm command file). |
| -t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
| --module deepmod2 | Loads the deepmod2 module for each subjob in the swarm. |
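When there are many aligned BAM files, the swarmfile can be generated instead of written by hand. The sketch below is one way to do that. `make_swarm_lines` is a hypothetical helper, and it assumes the BAM files follow the `aligned_*.bam` naming used in the example. It writes each command on a single line and gives each one its own output directory, so that runs do not overwrite one another's `output.bam`.

```shell
#!/bin/bash
# Hypothetical helper: print one single-line deepmod2 command per aligned BAM
# file in bam_dir. Each command gets its own output directory, named after
# the BAM file, inside bam_dir.
make_swarm_lines() {
    local bam_dir=$1 pod5_dir=$2 ref=$3 threads=$4
    local bam sample
    for bam in "${bam_dir}"/aligned_*.bam; do
        [ -e "${bam}" ] || continue   # the glob matched nothing
        sample=$(basename "${bam}" .bam)
        printf 'deepmod2 detect --seq_type dna --model bilstm_r10.4.1_5khz_v4.3 --file_type pod5 --bam %s --input %s --output %s/deepmod2_%s/ --ref %s --threads %s\n' \
            "${bam}" "${pod5_dir}" "${bam_dir}" "${sample}" "${ref}" "${threads}"
    done
}
```

For the layout above, the swarmfile could be written with `make_swarm_lines mod data/nanopore_raw_data /fdb/igenomes/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa 8 > deepmod2.swarm`, with the `-t` value passed to swarm matching the thread count given here.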