High-Performance Computing at the NIH
SIFT

Description

SIFT predicts whether an amino acid substitution affects protein function based on sequence homology and the physical properties of amino acids. SIFT can be applied to naturally occurring nonsynonymous polymorphisms and laboratory-induced missense mutations.

How to Use

SIFT is made available through environment modules. To load it, type

module load SIFT

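Environment variables set by the module and used in the examples below include $SIFTHOME (which contains the test files), $SIFTDB (database location), and $SIFT_SCRATCHDIR (scratch space).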

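SIFT can require a large amount of disk space. The environment variable $SIFT_SCRATCHDIR is set to /lscratch/$SLURM_JOB_ID by default, so jobs should request local scratch space (as the --gres=lscratch option does in the sbatch examples below). It can be changed to another location: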

export SIFT_SCRATCHDIR=/path/to/new/tmp/area
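
Because the default scratch location only exists inside a job that requested local scratch, it can be useful to check the directory before starting a run. The following is a minimal sketch, not part of SIFT: the function name pick_sift_scratch and the fallback to a temporary directory are assumptions made for illustration.

```shell
# Choose a scratch directory for SIFT.
# Prefer an existing, writable $SIFT_SCRATCHDIR; otherwise fall back to a
# fresh temporary directory (an assumption for illustration, not site policy).
pick_sift_scratch() {
    local dir="${SIFT_SCRATCHDIR:-}"
    if [ -n "$dir" ] && [ -d "$dir" ] && [ -w "$dir" ]; then
        printf '%s\n' "$dir"
    else
        mktemp -d "${TMPDIR:-/tmp}/sift_scratch.XXXXXX"
    fi
}

SIFT_SCRATCHDIR="$(pick_sift_scratch)"
export SIFT_SCRATCHDIR
echo "SIFT scratch directory: $SIFT_SCRATCHDIR"

# Show how much free space the chosen location has.
df -h "$SIFT_SCRATCHDIR"
```

If the fallback is used, the temporary directory is not removed automatically and should be cleaned up once the results have been copied out.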

SIFT can use the following databases for protein alignment:

5.x series


Interactive Use

SIFT scores for mutations can be computed from a protein sequence in a FASTA file, which is aligned against a reference database.

$ cp $SIFTHOME/test/lacI.fasta .
$ seqs_chosen_via_median_info.csh lacI.fasta /fdb/blastdb/nr 2.75
$ mv $SIFT_SCRATCHDIR/lacI.* .
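
Before launching the alignment on your own sequence, it may help to confirm that the input looks like a FASTA file. The sketch below is not part of SIFT: the helper name is_fasta is hypothetical, the demo sequence is made up, and the check only looks at whether the first non-blank line starts with ">".

```shell
# Return success if the first non-blank line of the file starts with '>',
# which is how a FASTA header line begins. This is a shallow check only.
is_fasta() {
    local file="$1" first
    [ -r "$file" ] || return 1
    first="$(grep -m 1 -v '^[[:space:]]*$' "$file")"
    case "$first" in
        '>'*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Demonstration with a throwaway file; the sequence is invented for illustration.
demo="$(mktemp)"
printf '>demo_protein\nMKPVTLYDVA\n' > "$demo"
if is_fasta "$demo"; then
    echo "looks like FASTA"
else
    echo "not FASTA"
fi
rm -f "$demo"
```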

SIFT can also score single-nucleotide variants (SNVs) against a precomputed database. These databases are located in $SIFTDB.

$ cp $SIFTHOME/test/snvs_build37.input .
$ SIFT_exome_nssnvs.pl -i snvs_build37.input -d $SIFTDB/Human_db_37 -o $SIFT_SCRATCHDIR -z output.tsv

Batch Use

Here is an example SIFT script to run on the cluster:

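#!/bin/bash
# this file is sift_script.sh
module load SIFT
# copy the bundled test input into the working directory
cp $SIFTHOME/test/snvs_build37.input .
# score the variants against the Human_db_37 database, passing $SIFT_SCRATCHDIR as -o
SIFT_exome_nssnvs.pl -i snvs_build37.input -d $SIFTDB/Human_db_37 -o $SIFT_SCRATCHDIR -z output.tsv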

Then submit to the cluster:

sbatch --mem=10gb --gres=lscratch:10 sift_script.sh

6.x series


Interactive Use

SIFT scores for mutations can be computed from a protein sequence in a FASTA file, which is aligned against a reference database.

$ cp $SIFTHOME/test/lacI.fasta .
$ SIFT_for_submitting_fasta_seq.csh lacI.fasta $SIFTDB/uniref90.fa -
$ mv $SIFT_SCRATCHDIR/lacI.* .

Batch Use

Here is an example SIFT script to run on the cluster:

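#!/bin/bash
# this file is sift_script.sh
module load SIFT/6.0.1
# copy the bundled test sequence into the working directory
cp $SIFTHOME/test/lacI.fasta .
# align against the uniref90 database and compute SIFT scores
SIFT_for_submitting_fasta_seq.csh lacI.fasta $SIFTDB/uniref90.fa -
# move the results out of scratch space into the working directory
mv $SIFT_SCRATCHDIR/lacI.* .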

Then submit to the cluster:

sbatch --mem=16gb --gres=lscratch:10 sift_script.sh
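
To run several sequences, one option is to generate a batch script for each FASTA file. The following is a rough sketch under stated assumptions: the function name make_sift_script is hypothetical, the commands are copied from the example script above, and it assumes the output files in $SIFT_SCRATCHDIR are named after the input file's stem, as they are for lacI.fasta.

```shell
# Print a batch script for one FASTA file to standard output.
# The commands mirror the 6.x example script; the function itself and the
# assumption that outputs are named <stem>.* are for illustration only.
make_sift_script() {
    local fasta="$1"
    local stem
    stem="$(basename "${fasta%.*}")"
    cat <<EOF
#!/bin/bash
module load SIFT/6.0.1
SIFT_for_submitting_fasta_seq.csh "$fasta" \$SIFTDB/uniref90.fa -
mv \$SIFT_SCRATCHDIR/${stem}.* .
EOF
}

# Show the script that would be generated for the test sequence.
make_sift_script lacI.fasta
```

Each generated script could be redirected to its own file and submitted with the same sbatch options shown above.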

Documentation