USEARCH is a unique sequence analysis tool with thousands of users world-wide. USEARCH offers search and clustering algorithms that are often orders of magnitude faster than BLAST.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive salloc.exe: Pending job allocation 46116226 salloc.exe: job 46116226 queued and waiting for resources salloc.exe: job 46116226 has been allocated resources salloc.exe: Granted job allocation 46116226 salloc.exe: Waiting for resource configuration salloc.exe: Nodes cn3144 are ready for job [user@cn3144 ~]$ module load usearch [user@cn3144 ~]$ cp -R $USEARCH_EXAMPLES/hmptut . [user@cn3144 ~]$ cd hmptut/scripts [user@cn3144 ~]$ bash run.bash [user@cn3144 ~]$ exit salloc.exe: Relinquishing job allocation 46116226 [user@biowulf ~]$
Create a batch input file (e.g. usearch.sh). For example:
#!/bin/bash # --- This file is usearch.sh -- module load usearch usearch -cluster_fast seqs.fasta -id 0.9 -centroids nr.fasta
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] usearch.sh
Create a swarmfile (e.g. usearch.swarm). For example:
usearch -cluster_fast seqs.fasta.1 -id 0.9 -centroids nr.fasta usearch -cluster_fast seqs.fasta.2 -id 0.9 -centroids nr.fasta usearch -cluster_fast seqs.fasta.3 -id 0.9 -centroids nr.fasta usearch -cluster_fast seqs.fasta.4 -id 0.9 -centroids nr.fasta
Submit this job using the swarm command.
swarm -f usearch.swarm [-g #] [-t #] --module usearchwhere
-g # | Number of Gigabytes of memory required for each process (1 line in the swarm command file) |
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
--module usearch | Loads the usearch module for each subjob in the swarm |