SIFT (5.x) on Biowulf

SIFT predicts whether an amino acid substitution affects protein function based on sequence homology and the physical properties of amino acids. SIFT can be applied to naturally occurring nonsynonymous polymorphisms and laboratory-induced missense mutations.

Documentation

The documentation here pertains to SIFT version 5.x. For version 6.x, see the separate SIFT 6.x page.

Important Notes

SIFT can require a large amount of disk space. The environment variable $SIFT_SCRATCHDIR is set to /lscratch/$SLURM_JOB_ID by default, but can be pointed at a different location:

export SIFT_SCRATCHDIR=/path/to/new/tmp/area
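
When redirecting the scratch area, it can help to create the directory first and check how much space is free there. The following is a minimal sketch, not part of SIFT: the function name use_sift_scratch is hypothetical, and it relies only on the standard mkdir, df, and awk utilities.

```shell
# Hypothetical helper (not provided by SIFT or Biowulf).
# Creates the directory if needed, exports SIFT_SCRATCHDIR, and reports free space.
use_sift_scratch() {
    scratch_dir="$1"
    mkdir -p "$scratch_dir" || return 1
    export SIFT_SCRATCHDIR="$scratch_dir"
    # df -Pk prints one POSIX-format line per filesystem; field 4 is the available space in KB
    free_kb=$(df -Pk "$scratch_dir" | awk 'NR==2 {print $4}')
    echo "SIFT_SCRATCHDIR=$SIFT_SCRATCHDIR (${free_kb} KB available)"
}
```

For example, use_sift_scratch /path/to/new/tmp/area would set the variable and print the space available on that filesystem.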

SIFT can use the following databases for protein alignment:

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf ~]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load SIFT/5.2.2
[user@cn3144 ~]$ cp $SIFTHOME/test/lacI.fasta .
[user@cn3144 ~]$ seqs_chosen_via_median_info.csh lacI.fasta /fdb/blastdb/nr 2.75
[user@cn3144 ~]$ mv $SIFT_SCRATCHDIR/lacI.* .
[user@cn3144 ~]$ cp $SIFTHOME/test/snvs_build37.input .
[user@cn3144 ~]$ SIFT_exome_nssnvs.pl -i snvs_build37.input -d $SIFTDB/Human_db_37 -o $SIFT_SCRATCHDIR -z output.tsv

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. SIFT.sh). For example:

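#!/bin/bash
# Load the SIFT 5.2.2 environment
module load SIFT/5.2.2
# Copy the bundled test input into the working directory
cp $SIFTHOME/test/snvs_build37.input .
# Run SIFT on the listed SNVs against the human build 37 database
SIFT_exome_nssnvs.pl -i snvs_build37.input -d $SIFTDB/Human_db_37 -o $SIFT_SCRATCHDIR -z output.tsv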
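
With the default setting, /lscratch/$SLURM_JOB_ID is removed when the job ends, so any files left in $SIFT_SCRATCHDIR that you want to keep should be copied out before the script exits. A minimal sketch, assuming the files of interest sit at the top level of the scratch directory; the function name save_sift_scratch is hypothetical and not part of SIFT:

```shell
# Hypothetical helper (not provided by SIFT): copy regular files from the top
# level of $SIFT_SCRATCHDIR into a destination directory (default: current directory).
save_sift_scratch() {
    dest_dir="${1:-.}"
    if [ -z "${SIFT_SCRATCHDIR:-}" ] || [ ! -d "${SIFT_SCRATCHDIR:-}" ]; then
        echo "SIFT_SCRATCHDIR is unset or not a directory" >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    for f in "$SIFT_SCRATCHDIR"/*; do
        # Skip subdirectories, and the unexpanded pattern when the directory is empty
        if [ -f "$f" ]; then
            cp -p "$f" "$dest_dir"/ || return 1
        fi
    done
}
```

Calling save_sift_scratch "$SLURM_SUBMIT_DIR/sift_out" as the last line of the batch script would keep those files next to where the job was submitted.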

Submit this job using the Slurm sbatch command.

sbatch [--cpus-per-task=#] [--mem=#] SIFT.sh
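
Because the default scratch location is under /lscratch, the job needs a local scratch allocation, which on Biowulf is requested with --gres=lscratch:N (N in GB). The sketch below assembles such a command for review before submission. The wrapper name sift_sbatch_cmd and the default sizes (20 GB scratch, 8 GB memory) are illustrative assumptions, not recommendations from the SIFT documentation.

```shell
# Hypothetical helper: print an sbatch command line that requests local scratch.
# Arguments: script [scratch_GB] [mem_GB]; the defaults are placeholders to tune for your data.
sift_sbatch_cmd() {
    job_script="$1"
    scratch_gb="${2:-20}"
    mem_gb="${3:-8}"
    echo "sbatch --gres=lscratch:${scratch_gb} --mem=${mem_gb}g ${job_script}"
}
```

For example, sift_sbatch_cmd SIFT.sh 50 16 prints sbatch --gres=lscratch:50 --mem=16g SIFT.sh, which can then be run as is.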