High-Performance Computing at the NIH
netMHC on Biowulf2 & Helix

NetMHC 4.0 predicts binding of peptides to a number of different HLA alleles using artificial neural networks (ANNs). The NetMHC method is described in detail in four articles; see the NetMHC website for the references.

netMHC was developed by the Center for Biological Sequence Analysis at the Technical University of Denmark (DTU). [NetMHC website]

On Helix
Sample session:
[user@helix ~]$ module load netMHC

[user@helix ~]$ netMHC $NMHOME/test/test.fsa
# /usr/local/apps/netMHC/netMHC-4.0/Linux_x86_64/bin/netMHC /usr/local/apps/netMHC/netMHC-4.0/test/test.fsa
# Mon Nov 21 16:12:55 2016
# User: user
# PWD : /scratch/user
# Host: Linux biowulf.nih.gov 2.6.32-642.6.2.el6.x86_64 x86_64
# Command line parameters set to:
#	[-a line]            HLA-A0201            HLA allele name
#	[-f filename]                             Input file (by default in FASTA format)
#	[-p]                 0                    Switch on if input is a list of peptides (Peptide format)
#	[-l string]          9                    Peptide length (multiple lengths separated by comma e.g. 8,9,10)
#	[-s]                 0                    Sort output on decreasing affinity
#	[-rth float]         0.500000             Threshold for high binding peptides (%Rank)
#	[-rlt float]         2.000000             Threshold for low binding peptides (%Rank)
#	[-listMHC]           0                    Print list of alleles included in netMHC
#	[-xls]               0                    Save output to xls file
#	[-xlsfile filename]  NetMHC_out.xls       File name for xls output
#	[-t float]           -99.900002           Threshold for output
#	[-thrfmt filename]   /usr/local/apps/netMHC/netMHC-4.0/Linux_x86_64/data/threshold/%s.thr Format for threshold filenames
#	[-hlalist filename]  /usr/local/apps/netMHC/netMHC-4.0/Linux_x86_64/data/allelelist File with covered HLA names
#	[-rdir filename]     /usr/local/apps/netMHC/netMHC-4.0/Linux_x86_64 Home directory for NetMHC
#	[-tdir filename]     /scratch             Temporary directory (Default $$)
#	[-syn filename]      /usr/local/apps/netMHC/netMHC-4.0/Linux_x86_64/data/synlists/%s.synlist Format of synlist file
#	[-v]                 0                    Verbose mode
#	[-dirty]             0                    Dirty mode, leave tmp dir+files
#	[-inptype int]       0                    Input type [0] FASTA [1] Peptide
#	[-version filename]  /usr/local/apps/netMHC/netMHC-4.0/Linux_x86_64/data/version File with version information
#	[-w]                 0                    w option for webface

# NetMHC version 4.0

# Input is in FSA format

# Peptide length 9
# Rank Threshold for Strong binding peptides   0.500
# Rank Threshold for Weak binding peptides   2.000
-----------------------------------------------------------------------------------
  pos          HLA      peptide         Core Offset  I_pos  I_len  D_pos  D_len        iCore        Identity 1-log50k(aff) Affinity(nM)    %Rank  BindLevel
-----------------------------------------------------------------------------------
    0    HLA-A0201    TMDKSELVQ    TMDKSELVQ      0      0      0      0      0    TMDKSELVQ 143B_BOVIN_P293         0.051     28676.59    43.00
    1    HLA-A0201    MDKSELVQK    MDKSELVQK      0      0      0      0      0    MDKSELVQK 143B_BOVIN_P293         0.030     36155.15    70.00
    2    HLA-A0201    DKSELVQKA    DKSELVQKA      0      0      0      0      0    DKSELVQKA 143B_BOVIN_P293         0.030     36188.42    70.00
    3    HLA-A0201    KSELVQKAK    KSELVQKAK      0      0      0      0      0    KSELVQKAK 143B_BOVIN_P293         0.032     35203.22    65.00
    4    HLA-A0201    SELVQKAKL    SELVQKAKL      0      0      0      0      0    SELVQKAKL 143B_BOVIN_P293         0.031     35670.99    65.00
    5    HLA-A0201    ELVQKAKLA    ELVQKAKLA      0      0      0      0      0    ELVQKAKLA 143B_BOVIN_P293         0.080     21113.07    29.00
    6    HLA-A0201    LVQKAKLAE    LVQKAKLAE      0      0      0      0      0    LVQKAKLAE 143B_BOVIN_P293         0.027     37257.56    75.00
    7    HLA-A0201    VQKAKLAEQ    VQKAKLAEQ      0      0      0      0      0    VQKAKLAEQ 143B_BOVIN_P293         0.040     32404.62    55.00
    8    HLA-A0201    QKAKLAEQA    QKAKLAEQA      0      0      0      0      0    QKAKLAEQA 143B_BOVIN_P293         0.031     35588.50    65.00
    9    HLA-A0201    KAKLAEQAE    KAKLAEQAE      0      0      0      0      0    KAKLAEQAE 143B_BOVIN_P293         0.020     40215.09    85.00
   10    HLA-A0201    AKLAEQAER    AKLAEQAER      0      0      0      0      0    AKLAEQAER 143B_BOVIN_P293         0.015     42412.05    95.00
   [...]
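
The result table above is whitespace-delimited, so it can be post-processed with standard tools. The following is a minimal sketch, not part of netMHC itself, that keeps only the rows whose %Rank is at or below a chosen cutoff. The function name filter_by_rank is hypothetical, and the field positions (peptide = 3, Affinity(nM) = 13, %Rank = 14) are assumed from the column layout shown in the sample session.

```shell
# Keep only result rows whose %Rank is at or below a cutoff.
# Assumed layout (from the sample output): field 3 = peptide,
# field 13 = Affinity(nM), field 14 = %Rank.
filter_by_rank() {
    # $1 = maximum %Rank to keep; netMHC output is read from stdin
    awk -v max="$1" '
        # data rows start with a numeric position; comments, rulers
        # and the column header are skipped
        $1 ~ /^[0-9]+$/ && NF >= 14 && ($14 + 0) <= (max + 0) {
            print $3, $13, $14
        }'
}

# Demo on three rows copied from the sample session above
filter_by_rank 50 <<'EOF'
# NetMHC version 4.0
  pos          HLA      peptide         Core Offset  I_pos  I_len  D_pos  D_len        iCore        Identity 1-log50k(aff) Affinity(nM)    %Rank  BindLevel
    0    HLA-A0201    TMDKSELVQ    TMDKSELVQ      0      0      0      0      0    TMDKSELVQ 143B_BOVIN_P293         0.051     28676.59    43.00
    1    HLA-A0201    MDKSELVQK    MDKSELVQK      0      0      0      0      0    MDKSELVQK 143B_BOVIN_P293         0.030     36155.15    70.00
    5    HLA-A0201    ELVQKAKLA    ELVQKAKLA      0      0      0      0      0    ELVQKAKLA 143B_BOVIN_P293         0.080     21113.07    29.00
EOF
```

With a cutoff of 50 the demo prints the TMDKSELVQ and ELVQKAKLA rows. On a real run the function would be used at the end of a pipe, for example: netMHC $NMHOME/test/test.fsa | filter_by_rank 2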
   
Batch job on Biowulf

Sample batch script:

#! /bin/bash
# 
set -e

module load netMHC

netMHC $NMHOME/test/test.fsa

The batch file is submitted to the queue with a command similar to the following:

biowulf$ sbatch  myscript

netMHC is a single-threaded program. There is no point in requesting more than the default 2 CPUs. However, for large jobs you may require more than the default memory (4 GB). In that case, you can submit with

sbatch --mem=#g  myscript

where # = number of GB of memory.
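
As a small illustration of the # substitution, the sketch below builds the submit command for a given number of GB and prints it instead of running it. The helper name build_sbatch_cmd is hypothetical, and myscript is the batch script name used above.

```shell
# Print (rather than run) the sbatch command for a memory request in GB.
# build_sbatch_cmd is an illustrative helper, not a Biowulf utility.
build_sbatch_cmd() {
    # $1 = memory in whole GB, $2 = batch script
    case "$1" in
        ''|*[!0-9]*)
            echo "memory must be a whole number of GB" >&2
            return 1
            ;;
    esac
    printf 'sbatch --mem=%sg %s\n' "$1" "$2"
}

# Demo: request 8 GB for the sample batch script
build_sbatch_cmd 8 myscript
```

The demo prints "sbatch --mem=8g myscript", which can then be run on the Biowulf login node.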

Swarm of jobs on Biowulf2

To set up a swarm of netMHC jobs, use a swarm file with one command per line, like this:

netMHC   prot1.fas
netMHC   prot2.fas
netMHC   prot3.fas
[...]

Then submit the swarm, requesting # GB of memory for each task:

biowulf$ swarm -f swarmfile -g # --module netMHC

See the swarm documentation for more details.
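
When there are many input files, the swarm file does not have to be written by hand. The following is a minimal sketch that writes one netMHC line per FASTA file in a directory. The function name make_swarmfile, the .fas extension, and the placeholder file names are assumptions made for illustration; file names containing spaces are not handled.

```shell
# Write one "netMHC <file>" line per FASTA file in a directory.
# make_swarmfile and the .fas extension are illustrative assumptions.
make_swarmfile() {
    # $1 = directory holding the FASTA files, $2 = swarm file to write
    : > "$2"                       # start with an empty swarm file
    for f in "$1"/*.fas; do
        [ -e "$f" ] || continue    # skip the unexpanded pattern if nothing matches
        printf 'netMHC   %s\n' "$f" >> "$2"
    done
}

# Demo with empty placeholder files in a scratch directory
demo_dir=$(mktemp -d)
touch "$demo_dir/prot1.fas" "$demo_dir/prot2.fas" "$demo_dir/prot3.fas"
make_swarmfile "$demo_dir" "$demo_dir/swarmfile"
cat "$demo_dir/swarmfile"
```

The demo prints three lines, one per placeholder file. The resulting swarm file is then submitted in the same way as the hand-written one above.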

Interactive job on Biowulf

Allocate an interactive session and run netMHC there. Sample session:

biowulf$ sinteractive  --mem=5G
salloc.exe: Granted job allocation 240602
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn0044 are ready for job
cn0044$ module load netMHC
cn0044$ netMHC  myprot.fas
[...]
cn0044$ exit
salloc.exe: Relinquishing job allocation 240602
biowulf$
Documentation

Type 'netMHC' on the command line to get a summary of options.

netMHC website