From the documentation:
Improved variant filtering and polishing via k-mer validation
Example files are in $MERFIN_TEST_DATA
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --cpus-per-task=16 --mem=36g --gres=lscratch:100
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ cd /lscratch/$SLURM_JOB_ID
[user@cn3144]$ module load merfin
[user@cn3144]$ # unpack a meryl k-mer database created from PE 250 bp Illumina reads from the son of a trio
[user@cn3144]$ # with meryl count k=21 reads.fastq.gz output HG002.k21.meryl
[user@cn3144]$ # followed by excluding k-mers with a frequency of 1
[user@cn3144]$ tar -xzf ${MERFIN_TEST_DATA:-none}/HG002.k21.gt1.meryl.tar.gz
[user@cn3144]$ cp ${MERFIN_TEST_DATA:-none}/{chr20.fasta.gz,ill.vcf.gz} .
[user@cn3144]$ merfin -filter -sequence chr20.fasta.gz \
                   -memory 34 \
                   -threads $SLURM_CPUS_PER_TASK \
                   -readmers HG002.k21.gt1.meryl \
                   -vcf ill.vcf.gz \
                   -output test.merfin
[user@cn3144]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf]$
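The comments in the session describe how the k-mer database was built: a meryl count at k=21, followed by removal of k-mers seen only once. A minimal sketch of those two steps follows. It assumes meryl is on the PATH and that its `greater-than` filter is used for the second step; the session only says singletons were excluded, not how. The helper names `run` and `build_db` and the `DRY_RUN` variable are hypothetical, and by default the commands are only printed, not executed.

```shell
#!/bin/bash
# Hypothetical helper: print the command when DRY_RUN=1 (the default), else run it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: count k-mers in the reads, then drop k-mers with frequency 1.
# Arguments: reads file, output prefix, k-mer size (defaults to 21).
build_db() {
    local reads=$1
    local prefix=$2
    local k=${3:-21}
    # Count step, as given in the session comments.
    run meryl count k="$k" "$reads" output "$prefix.k$k.meryl"
    # Assumption: singletons are excluded with meryl's greater-than filter.
    run meryl greater-than 1 "$prefix.k$k.meryl" output "$prefix.k$k.gt1.meryl"
}

build_db reads.fastq.gz HG002
```

Setting DRY_RUN=0 runs the commands instead of printing them; the resulting HG002.k21.gt1.meryl directory is what the session passes to -readmers.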
Create a batch input file (e.g. merfin.sh). For example:
#!/bin/bash
module load merfin
tar -xzf ${MERFIN_TEST_DATA:-none}/HG002.k21.gt1.meryl.tar.gz
cp ${MERFIN_TEST_DATA:-none}/{chr20.fasta.gz,ill.vcf.gz} .
merfin -filter -sequence chr20.fasta.gz \
    -memory 34 \
    -threads $SLURM_CPUS_PER_TASK \
    -readmers HG002.k21.gt1.meryl \
    -vcf ill.vcf.gz \
    -output test.merfin
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=16 --mem=36g merfin.sh
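The examples request 36 GB from Slurm but pass -memory 34 to merfin, leaving about 2 GB of headroom. Rather than hard-coding the value, a batch script could derive it from the allocation. The sketch below assumes Slurm exports SLURM_MEM_PER_NODE in megabytes when --mem is given; the helper name `merfin_mem_gb`, the 2 GB default headroom, and the 36864 MB fallback are illustrative choices, not part of merfin or this page.

```shell
#!/bin/bash
# Hypothetical helper: GB to pass to merfin -memory, leaving some headroom.
# Argument: headroom in GB (defaults to 2).
merfin_mem_gb() {
    local headroom_gb=${1:-2}
    # Assumption: SLURM_MEM_PER_NODE is in MB; fall back to 36 GB outside a job.
    local mem_mb=${SLURM_MEM_PER_NODE:-36864}
    echo $(( mem_mb / 1024 - headroom_gb ))
}

echo "merfin -memory $(merfin_mem_gb)"
```

With a 36 GB allocation (36864 MB) and the default headroom this yields 34, matching the value used in the examples. In merfin.sh the call would replace the literal, as in -memory $(merfin_mem_gb).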