Qcat on NIH HPC Systems
Qcat is a Python command-line tool for demultiplexing Oxford Nanopore reads from FASTQ files. It accepts basecalled FASTQ files and splits the reads into separate FASTQ files based on their barcode. Qcat makes the demultiplexing algorithms used in Albacore/Guppy and EPI2ME available for local use with FASTQ files. Currently qcat implements the EPI2ME algorithm.
Batch job on Biowulf
Create a batch input file (e.g. qcat.sh). For example:
#!/bin/bash
cd /data/$USER/dir
module load qcat
qcat -f <fastq_file> -b <output directory>
Then submit the file on Biowulf:
sbatch qcat.sh
Please read the user guide for more sbatch options.
Useful utilities for job monitoring and debugging
Swarm of Jobs on Biowulf
Create a swarmfile (e.g. qcat.swarm). For example:
# this file is called qcat.swarm
cd dir1; qcat -f <fastq_file> -b <output directory>
cd dir2; qcat -f <fastq_file> -b <output directory>
cd dir3; qcat -f <fastq_file> -b <output directory>
[...]
Submit this job using the swarm command.
biowulf >$ swarm -f qcat.swarm --module qcat
More options for swarm jobs are described in the swarm documentation.
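When there are many run directories, the swarmfile can be generated rather than typed by hand. The following is a minimal sketch, not part of the qcat or swarm tooling: the function name make_qcat_swarm, the input file name reads.fastq, and the output directory name demultiplexed are all hypothetical placeholders to be replaced with your own names.

```shell
#!/bin/bash
# Sketch: write one swarm line per run directory.
# Assumptions (hypothetical, adjust to your data):
#   - every directory holds its reads in a file called reads.fastq
#   - demultiplexed reads should go to a subdirectory called demultiplexed

make_qcat_swarm() {
    local swarmfile=$1
    shift
    : > "$swarmfile"                      # create or truncate the swarmfile
    local dir
    for dir in "$@"; do
        # same "cd <dir>; qcat -f ... -b ..." form as the example swarmfile
        printf 'cd %s; qcat -f %s -b %s\n' \
            "$dir" "reads.fastq" "demultiplexed" >> "$swarmfile"
    done
}

make_qcat_swarm qcat.swarm dir1 dir2 dir3
cat qcat.swarm
```

The resulting qcat.swarm can then be submitted with the swarm command shown above.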
Interactive job on Biowulf
Allocate an interactive session. Sample session:
[USER@biowulf ~]$ sinteractive --mem=10g
salloc.exe: Pending job allocation 15194042
salloc.exe: job 15194042 queued and waiting for resources
salloc.exe: job 15194042 has been allocated resources
salloc.exe: Granted job allocation 15194042
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn1719 are ready for job
[USER@cn1719 ~]$ module load qcat
[USER@cn1719 ~]$ qcat -f <fastq_file> -b <output directory>
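Since qcat splits the reads into separate FASTQ files by barcode, a quick sanity check after a run is to count the reads in each file of the output directory. The sketch below is a hypothetical helper, not a qcat feature; it assumes the output files end in .fastq and contain standard four-line FASTQ records.

```shell
#!/bin/bash
# Sketch: print "<file name><TAB><read count>" for each FASTQ file
# in a qcat output directory.
# Assumptions: files end in .fastq and use four lines per read.

count_reads() {
    local outdir=$1
    local f lines
    for f in "$outdir"/*.fastq; do
        [ -e "$f" ] || continue           # skip if the glob matched nothing
        lines=$(grep -c '' "$f")          # total number of lines in the file
        printf '%s\t%d\n' "$(basename "$f")" "$((lines / 4))"
    done
}

# Usage: count_reads <output directory>
```

A file with far more reads than the others, or an unexpectedly large file of unclassified reads, may be worth a closer look before downstream analysis.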
Documentation