VSEARCH supports de novo and reference-based chimera detection, clustering, full-length and prefix dereplication, rereplication, reverse complementation, masking, all-vs-all pairwise global alignment, exact and global alignment searching, shuffling, subsampling, and sorting. It also supports FASTQ file analysis, filtering, conversion, and merging of paired-end reads.
Allocate an interactive session and run the program.
Sample session (user input follows each $ prompt):
[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144 ~]$ module load vsearch
[user@cn3144 ~]$ ln -s /usr/local/apps/vsearch/vsearch-data/BioMarKs50k.fsa .
[user@cn3144 ~]$ vsearch --cluster_fast BioMarKs50k.fsa --id 0.97 --centroids vsearch.out
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
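As a quick sanity check after clustering, the number of clusters can be read from the centroids file, since `--centroids` writes one FASTA record per cluster. The sketch below is a minimal illustration; `count_fasta_records` is a hypothetical helper name, not part of VSEARCH.

```shell
# Count FASTA records in a file by counting header lines (those starting with '>').
# count_fasta_records is a hypothetical helper, not a VSEARCH command.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Example call, run in the directory holding the clustering output:
# count_fasta_records vsearch.out
```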
Create a batch input file (e.g. vsearch.sh). For example:
#!/bin/bash
module load vsearch
vsearch --usearch_global queries.fsa --db database.fsa --id 0.9 --alnout alnout.txt
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] vsearch.sh
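Requesting CPUs with `--cpus-per-task` only helps if VSEARCH is told to use them. One possible way to keep the two in step is sketched below: the batch script passes the Slurm allocation to the `--threads` option of VSEARCH. The fallback to one thread when `SLURM_CPUS_PER_TASK` is unset is an assumption made for this sketch, and the file names are the same placeholders as in the example batch file.

```shell
# Write a batch script whose vsearch thread count follows the Slurm allocation.
cat > vsearch.sh <<'EOF'
#!/bin/bash
module load vsearch
# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is given;
# fall back to 1 thread otherwise (an assumption of this sketch).
vsearch --usearch_global queries.fsa --db database.fsa --id 0.9 \
    --alnout alnout.txt --threads "${SLURM_CPUS_PER_TASK:-1}"
EOF
```

The script could then be submitted with, for example, `sbatch --cpus-per-task=4 --mem=8g vsearch.sh`; the CPU and memory values here are placeholders to adjust to the size of the search.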
Create a swarmfile (e.g. vsearch.swarm). For example:
vsearch --usearch_global queries1.fsa --db database.fsa --id 0.9 --alnout alnout1.txt
vsearch --usearch_global queries2.fsa --db database.fsa --id 0.9 --alnout alnout2.txt
vsearch --usearch_global queries3.fsa --db database.fsa --id 0.9 --alnout alnout3.txt
vsearch --usearch_global queries4.fsa --db database.fsa --id 0.9 --alnout alnout4.txt
Submit this job using the swarm command.
swarm -f vsearch.swarm [-g #] [-t #] --module vsearch
where
-g # | Number of gigabytes of memory required for each process (1 line in the swarm command file). |
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
--module vsearch | Loads the VSEARCH module for each subjob in the swarm. |
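With many query files, writing the swarmfile by hand becomes tedious. The sketch below generates one command line per query file, following the same pattern as the swarmfile example. `make_swarmfile` is a hypothetical helper name, and the sketch assumes query files named like `queries1.fsa` with no spaces in their names.

```shell
# Print one vsearch command per query file; each printed line is one swarm subjob.
# Usage: make_swarmfile DATABASE QUERY_FILE...
# make_swarmfile is a hypothetical helper, not part of swarm or VSEARCH.
make_swarmfile() {
    local db=$1
    shift
    local query base
    for query in "$@"; do
        base=$(basename "$query" .fsa)   # queries1.fsa -> queries1
        # Strip the "queries" prefix so queries1.fsa maps to alnout1.txt.
        printf 'vsearch --usearch_global %s --db %s --id 0.9 --alnout %s\n' \
            "$query" "$db" "alnout${base#queries}.txt"
    done
}

# Example call, writing the swarmfile for every matching query file:
# make_swarmfile database.fsa queries*.fsa > vsearch.swarm
```

Generating the file rather than editing it keeps the identity threshold and database name identical across all subjobs.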