SIFT (5.x) on Biowulf
SIFT predicts whether an amino acid substitution affects protein function based on sequence homology and the physical properties of amino acids. SIFT can be applied to naturally occurring nonsynonymous polymorphisms and laboratory-induced missense mutations.
References:
- Sim NL, Kumar P, Hu J, Henikoff S, Schneider G, Ng PC. Nucleic Acids Res. 2012 Jul;40
Documentation
The documentation here pertains to SIFT version 5.x. For version 6.x, see the separate SIFT 6.x documentation.
Important Notes
- Module Name: SIFT (see the modules page for more information)
- Multithreaded
- Environment variables set:
  - BLIMPS_DIR: directory where blimps is installed
  - NCBI: directory where NCBI blast is installed
  - SIFTDB: directory where SIFT reference files are held
  - SIFTHOME: directory where SIFT is installed
  - SIFT_SCRATCHDIR: directory where SIFT output is written
- Example files in $SIFTHOME/test
- Reference data in /fdb/SIFT/
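A quick way to confirm that the module set the variables listed above is to print them from the current shell. The following is a minimal sketch; the function name check_sift_env is hypothetical and not part of SIFT or the module.

```shell
#!/bin/bash
# check_sift_env is a hypothetical helper (an assumption for this sketch).
# It prints each variable named in the notes above and returns non-zero
# if any of them is unset or empty.
check_sift_env() {
    local var val missing=0
    for var in BLIMPS_DIR NCBI SIFTDB SIFTHOME SIFT_SCRATCHDIR; do
        val="${!var:-}"          # indirect expansion: value of the variable named by $var
        if [ -n "$val" ]; then
            printf '%-16s %s\n' "$var" "$val"
        else
            printf '%-16s (not set)\n' "$var"
            missing=1
        fi
    done
    return "$missing"
}

check_sift_env || echo "Some SIFT variables are unset; has the SIFT module been loaded?"
```

Run it after loading the module inside a job; outside a job, or before loading the module, it simply reports which variables are missing.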
SIFT can require a large amount of disk space. The environment variable $SIFT_SCRATCHDIR is set to /lscratch/$SLURM_JOB_ID by default, but can be changed.
export SIFT_SCRATCHDIR=/path/to/new/tmp/area
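As a minimal sketch of the override above, the following uses the job-local /lscratch directory when it exists and otherwise falls back to another location. The helper name choose_sift_scratch and the fallback path under $TMPDIR are assumptions for illustration, not part of SIFT or the module.

```shell
#!/bin/bash
# choose_sift_scratch is a hypothetical helper (an assumption for this sketch).
# It prefers /lscratch/$SLURM_JOB_ID when that directory exists and is
# writable; otherwise it creates and returns the fallback directory.
choose_sift_scratch() {
    local fallback="$1"
    local lscratch="/lscratch/${SLURM_JOB_ID:-}"
    if [ -n "${SLURM_JOB_ID:-}" ] && [ -d "$lscratch" ] && [ -w "$lscratch" ]; then
        echo "$lscratch"
    else
        mkdir -p "$fallback" && echo "$fallback"
    fi
}

# The fallback location here is an arbitrary example path.
export SIFT_SCRATCHDIR="$(choose_sift_scratch "${TMPDIR:-/tmp}/sift_scratch_$$")"
echo "SIFT_SCRATCHDIR=$SIFT_SCRATCHDIR"

# Show free space on the chosen filesystem, since SIFT output can be large.
df -h "$SIFT_SCRATCHDIR"
```

Set the variable after loading the module so that the module's default does not overwrite it.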
SIFT can use the following databases for protein alignment:
- /fdb/blastdb/nr -- NCBI non-redundant
- $SIFTDB/uniref90.fa -- UniRef 90
- $SIFTDB/uniprotkb_swissprot.fa -- SwissProt
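The database is given as the second argument to seqs_chosen_via_median_info.csh, as in the sample session on this page. The sketch below maps a short label to one of the paths listed above; the labels, the function name sift_db_path, and the SIFT_DB_LABEL variable are hypothetical conveniences, and the input file and third argument (2.75) are copied from the sample session.

```shell
#!/bin/bash
# sift_db_path is a hypothetical helper (an assumption for this sketch).
# It translates a short label into one of the database paths listed above.
sift_db_path() {
    case "$1" in
        nr)        echo "/fdb/blastdb/nr" ;;
        uniref90)  echo "$SIFTDB/uniref90.fa" ;;
        swissprot) echo "$SIFTDB/uniprotkb_swissprot.fa" ;;
        *)         echo "unknown database label: $1" >&2; return 1 ;;
    esac
}

# SIFT_DB_LABEL is an illustrative variable name; it defaults to nr.
db="$(sift_db_path "${SIFT_DB_LABEL:-nr}")"
echo "Using database: $db"

# Run the alignment step only when the SIFT module has been loaded.
if command -v seqs_chosen_via_median_info.csh >/dev/null 2>&1; then
    seqs_chosen_via_median_info.csh lacI.fasta "$db" 2.75
fi
```

For example, setting SIFT_DB_LABEL=swissprot before running the sketch would select $SIFTDB/uniprotkb_swissprot.fa instead of nr.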
Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144 ~]$ module load SIFT/5.2.2
[user@cn3144 ~]$ cp $SIFTHOME/test/lacI.fasta .
[user@cn3144 ~]$ seqs_chosen_via_median_info.csh lacI.fasta /fdb/blastdb/nr 2.75
[user@cn3144 ~]$ mv $SIFT_SCRATCHDIR/lacI.* .
[user@cn3144 ~]$ cp $SIFTHOME/test/snvs_build37.input .
[user@cn3144 ~]$ SIFT_exome_nssnvs.pl -i snvs_build37.input -d $SIFTDB/Human_db_37 -o $SIFT_SCRATCHDIR -z output.tsv
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Batch job
Most jobs should be run as batch jobs.
Create a batch input file (e.g. SIFT.sh). For example:
#!/bin/bash
module load SIFT/5.2.2
cp $SIFTHOME/test/snvs_build37.input .
SIFT_exome_nssnvs.pl -i snvs_build37.input -d $SIFTDB/Human_db_37 -o $SIFT_SCRATCHDIR -z output.tsv
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] SIFT.sh
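The steps above can be sketched end to end: write the batch file as SIFT.sh, check its syntax, and submit it. The resource values below are placeholders rather than recommendations, and --gres=lscratch:20 assumes the cluster allocates node-local scratch through a Slurm generic resource named lscratch, which the default $SIFT_SCRATCHDIR of /lscratch/$SLURM_JOB_ID would appear to need.

```shell
#!/bin/bash
# Write the batch script from the example above to SIFT.sh.
# The quoted 'EOF' keeps $SIFTHOME etc. unexpanded until the job runs.
cat > SIFT.sh <<'EOF'
#!/bin/bash
module load SIFT/5.2.2
cp $SIFTHOME/test/snvs_build37.input .
SIFT_exome_nssnvs.pl -i snvs_build37.input -d $SIFTDB/Human_db_37 -o $SIFT_SCRATCHDIR -z output.tsv
EOF

# Parse the script without executing it to catch syntax errors early.
bash -n SIFT.sh && echo "SIFT.sh: syntax OK"

# Submit only where Slurm is available. The CPU, memory, and lscratch
# values are assumptions for illustration; size them for the actual input.
if command -v sbatch >/dev/null 2>&1; then
    sbatch --cpus-per-task=4 --mem=8g --gres=lscratch:20 SIFT.sh
fi
```

Because the example writes its results to $SIFT_SCRATCHDIR, a real batch script would likely also copy output.tsv back to permanent storage before the job ends.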