Dali on Biowulf

The three-dimensional coordinates of each protein are used to calculate residue-residue distance matrices.

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program.
Sample session (user input follows the $ prompt):

[user@biowulf ~]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load dali
[user@cn3144 ~]$ cp /pdb/pdb/pp/pdb1ppt.ent.gz .
[user@cn3144 ~]$ cp /pdb/pdb/bb/pdb1bba.ent.gz .
[user@cn3144 ~]$ import.pl --pdbfile pdb1ppt.ent.gz --pdbid 1ppt --dat ./
[user@cn3144 ~]$ import.pl --pdbfile pdb1bba.ent.gz --pdbid 1bba --dat ./
[user@cn3144 ~]$ dali.pl --pdbfile1 pdb1ppt.ent.gz --pdbfile2 pdb1bba.ent.gz --dat1 ./ --dat2 ./ --outfmt "summary,alignments"
[user@cn3144 ~]$ cat mol1A.txt
# Job: test
# Query: mol1A
# No:  Chain   Z    rmsd lali nres  %id PDB  Description
   1:  mol2-A  3.6  1.8   33    36   39   MOLECULE: BOVINE PANCREATIC POLYPEPTIDE;

# Pairwise alignments

No 1: Query=mol1A Sbjct=mol2A Z-score=3.6

DSSP  LLLLLLLLLLLLLHHHHHHHHHHHHHHHHHHLLlll
Query GPSQPTYPGDDAPVEDLIRFYDNLQQYLNVVTRhry   36
ident  |  | |||| |  |        |  | |  ||
Sbjct APLEPEYPGDNATPEQMAQYAAELRRYINMLTRpry   36
DSSP  LLLLLLLLLLLLLLLHHHHHHHHHHHHHHHHLLlll

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
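
The summary table can also be filtered from the command line. The following is a minimal sketch, assuming the row layout shown in the sample output above (hit number, chain, Z-score, and so on); the function name hits_above_z and the threshold are illustrative and not part of Dali.

```shell
#!/bin/bash
# Hypothetical helper: print "chain Z" for summary hits with Z >= threshold.
# Assumes summary rows look like "   1:  mol2-A  3.6  1.8 ...", as in the
# sample output above; header, comment, and alignment lines are skipped
# because their first field is not of the form "<number>:".
hits_above_z() {
    # $1 = minimum Z-score, $2 = Dali summary file
    awk -v zmin="$1" '$1 ~ /^[0-9]+:$/ && $3 + 0 >= zmin { print $2, $3 }' "$2"
}
```

For example, hits_above_z 2.0 mol1A.txt lists the chains whose Z-score is at least 2.0.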

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. dali.sh). The example below assumes that both PDB files have already been copied to the working directory:

#!/bin/bash
set -e
module load dali
import.pl --pdbfile pdb1ppt.ent.gz --pdbid 1ppt --dat ./
import.pl --pdbfile pdb1bba.ent.gz --pdbid 1bba --dat ./
dali.pl --pdbfile1 pdb1ppt.ent.gz --pdbfile2 pdb1bba.ent.gz --dat1 ./ --dat2 ./ --outfmt "summary,alignments"

Submit this job using the Slurm sbatch command.

sbatch dali.sh
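
When more than two structures are involved, the import.pl lines in the batch script could be generated by a loop. This is a minimal sketch, assuming the files follow the pdbXXXX.ent.gz naming used in the examples above; pdbid_from_file is a hypothetical helper, not part of Dali.

```shell
#!/bin/bash
# Hypothetical helper: derive the PDB ID from a file name such as
# pdb1ppt.ent.gz (the naming used under /pdb/pdb/ in the examples above).
pdbid_from_file() {
    name=$(basename "$1")
    name=${name#pdb}        # drop the leading "pdb"
    name=${name%.ent.gz}    # drop the trailing ".ent.gz"
    printf '%s\n' "$name"
}

# Import every matching file in the current directory.
# Run "module load dali" first so that import.pl is on the PATH.
for f in pdb*.ent.gz; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    import.pl --pdbfile "$f" --pdbid "$(pdbid_from_file "$f")" --dat ./
done
```

The helper only strips the prefix and suffix, so files with other naming schemes would need their IDs passed explicitly.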

MPI batch job

In certain circumstances, Dali can be accelerated using MPI. To do so, include --np $SLURM_NTASKS in the dali.pl command and submit the job with -n # -N 1, where # is the number of MPI tasks requested. MPI only works on a single node, so # must be less than 56. If # is greater than 8, you must also include -p multinode.

...
dali.pl --np $SLURM_NTASKS ...
...

Submit this job using the Slurm sbatch command.

sbatch -n 32 -N 1 -p multinode dali.sh
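
The task-count limit described above can be checked at the top of the batch script, before dali.pl starts. This is a minimal sketch; valid_ntasks is a hypothetical helper, and the fallback to a single process outside a Slurm allocation is an assumption rather than documented Dali behavior.

```shell
#!/bin/bash
# Hypothetical helper: accept only whole numbers from 1 to 55, matching the
# single-node limit stated above (fewer than 56 MPI tasks).
valid_ntasks() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
    [ "$1" -ge 1 ] && [ "$1" -lt 56 ]
}

# Assumption: fall back to 1 process when SLURM_NTASKS is not set.
NP="${SLURM_NTASKS:-1}"
if ! valid_ntasks "$NP"; then
    echo "Unsupported task count: $NP" >&2
    exit 1
fi
echo "Using --np $NP"
# dali.pl --np "$NP" ...
```

The check only guards the value passed to --np; the -N 1 and -p multinode options still have to be given to sbatch as shown above.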

AlphaFold searching

Running with the AlphaFold database:

#!/bin/bash
module load dali
dali.pl \
  --hierarchical \
  --oneway \
  --BLAST_DB ${DALI_AF}/Digest/AF.fasta \
  --pdbfile1 ${DALI_HOME}/example/my.pdb \
  --db ${DALI_AF}/Digest/HUMAN.list \
  --repset ${DALI_AF}/Digest/HUMAN_70.list \
  --dat1 ./ \
  --dat2 ${DALI_AF}/DAT/ \
  --title "my search" \
  --np ${SLURM_NTASKS}

Type ls ${DALI_AF}/Digest to see all the lists.