Pomoxis contains convenience wrappers around nanopore tools.
$POMOXIS_TEST_DATAAllocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --gres=lscratch:50 --cpus-per-task=6 --mem=12g salloc.exe: Pending job allocation 46116226 salloc.exe: job 46116226 queued and waiting for resources salloc.exe: job 46116226 has been allocated resources salloc.exe: Granted job allocation 46116226 salloc.exe: Waiting for resource configuration salloc.exe: Nodes cn3144 are ready for job [user@cn3144]$ cd /lscratch/$SLURM_JOB_ID [user@cn3144]$ ml pomoxis/0.3.15 [+] Loading singularity 4.0.1 on cn4299 [+] Loading pomoxis 0.3.15 [user@cn3144]$ cp -L $POMOXIS_TEST_DATA/* . [user@cn3144]$ ls -lh total 906M -rw-r----- 1 user staff 4.5M Sep 12 09:12 NC_000913.3.fasta -rw-r----- 1 user staff 901M Sep 12 09:12 R9_Ecoli_K12_MG1655_lambda_MinKNOW_0.51.1.62.all.fasta
The ecoli reads used in these examples were obtained from the
Loman Labs. The
mini_align script uses minimap2 to align reads and runs some common post-alignment
processing steps like sorting and aligning the resulting bam file.
[user@cn3144]$ mini_align -r NC_000913.3.fasta \
-i R9_Ecoli_K12_MG1655_lambda_MinKNOW_0.51.1.62.all.fasta \
-t $SLURM_CPUS_PER_TASK -p ecoli
Constructing minimap index.
[M::mm_idx_gen::0.178*1.01] collected minimizers
[M::mm_idx_gen::0.220*1.38] sorted minimizers
[M::main::0.274*1.31] loaded/built the index for 1 target sequence(s)
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.286*1.29] distinct minimizers: 838542 (98.18% are singletons); average occurrences: 1.034; average spacing: 5.352
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -I 16G -x map-ont -d NC_000913.3.fasta.mmi NC_000913.3.fasta
[M::main] Real time: 0.292 sec; CPU: 0.376 sec; Peak RSS: 0.049 GB
[samfaipath] build FASTA index...
[M::main::0.070*1.02] loaded/built the index for 1 target sequence(s)
[M::mm_mapopt_update::0.086*1.01] mid_occ = 12
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.097*1.01] distinct minimizers: 838542 (98.18% are singletons); average occurrences: 1.034; average spacing: 5.352
[M::worker_pipeline::128.218*5.37] mapped 72602 sequences
[M::worker_pipeline::203.551*5.30] mapped 59810 sequences
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -x map-ont -t 6 -a NC_000913.3.fasta.mmi R9_Ecoli_K12_MG1655_lambda_MinKNOW_0.51.1.62.all.fasta
[M::main] Real time: 203.564 sec; CPU: 1078.051 sec; Peak RSS: 3.832 GB
[bam_sort_core] merging from 0 files and 6 in-memory blocks...
De novo assembly with miniasm
[user@cn3144]$ mini_assemble -i R9_Ecoli_K12_MG1655_lambda_MinKNOW_0.51.1.62.all.fasta \
-o ecoli_assm -t $SLURM_CPUS_PER_TASK -m 1 -c
...much output...
[user@cn3144]$ assess_assembly -r NC_000913.3.fasta \
-i ecoli_assm/reads_final.fa -t $SLURM_CPUS_PER_TASK
...
name mean q10 q50 q90
err_ont 2.033% 1.783% 1.925% 2.332%
err_bal 2.052% 1.797% 1.942% 2.357%
iden 0.320% 0.257% 0.296% 0.344%
del 0.821% 0.721% 0.788% 0.913%
ins 0.913% 0.764% 0.858% 1.079%
# Q Scores
name mean q10 q50 q90
err_ont 16.92 17.49 17.16 16.32
err_bal 16.88 17.46 17.12 16.28
iden 24.95 25.89 25.29 24.64
del 20.85 21.42 21.03 20.40
ins 20.40 21.17 20.67 19.67
[user@cn3144]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf]$
Create a batch input file (e.g. pomoxis.sh), which uses the input file 'pomoxis.in'. For example:
#!/bin/bash
wd=$PWD
module load pomoxis/0.3.15 || exit 1
cd /lscratch/$SLURM_JOB_ID
cp -L $POMOXIS_TEST_DATA/R9_Ecoli_K12_MG1655_lambda_MinKNOW_0.51.1.62.all.fasta .
mini_assemble -i R9_Ecoli_K12_MG1655_lambda_MinKNOW_0.51.1.62.all.fasta \
-o ecoli_assm -t $SLURM_CPUS_PER_TASK -m 1 -c
mv ecoli_assm $wd
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=6 --gres=lscratch:50 --mem=10g pomoxis.sh
Create a swarmfile (e.g. pomoxis.swarm). For example:
mini_align -r NC_000913.3.fasta -i expt1.fastq -t $SLURM_CPUS_PER_TASK -p expt1 mini_align -r NC_000913.3.fasta -i expt2.fastq -t $SLURM_CPUS_PER_TASK -p expt2 mini_align -r NC_000913.3.fasta -i expt3.fastq -t $SLURM_CPUS_PER_TASK -p expt3
Submit this job using the swarm command.
swarm -f pomoxis.swarm -g 10 -t 6 --module pomoxis/0.2.3where
| -g # | Number of Gigabytes of memory required for each process (1 line in the swarm command file) |
| -t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
| --module pomoxis | Loads the pomoxis module for each subjob in the swarm |