Gurobi on Biowulf

Gurobi is a mathematical optimization solver. It is a commercial product developed by Gurobi Optimization, LLC (gurobi.com).

On Biowulf, Gurobi is licensed for use by the members of the CDSL_Gurobi_users group only. It is installed in /data/CDSL_Gurobi_users and is not accessible by any other users.

The current version of Gurobi on Biowulf is 10.0.0; the previously available versions are 9.1.0 and 9.0.0. Some of the examples below were generated with an earlier version.

If you are interested in using Gurobi on Biowulf, please contact Schaffer, Alejandro (NIH/NCI) [E] (alejandro.schaffer@nih.gov), who manages the Gurobi license and the CDSL_Gurobi_users group.

Gurobi should be cited in any publication that results from your use of Gurobi on Biowulf, whether you use it directly or through a third-party application.

Documentation

Important Notes

Startup

A module file, accessible only to members of the CDSL_Gurobi_users group, sets the appropriate paths and environment variables for Gurobi. Members of the CDSL_Gurobi_users group should add the following line to their /home/$USER/.bashrc:

module use --prepend /data/CDSL_Gurobi_users/modules
This will allow you to load the Gurobi module with 'module load gurobi'.
License Server

The Gurobi license server runs on the Biowulf login node. When the login node is rebooted (monthly), the Gurobi license server will be terminated. If that happens, it will need to be restarted by a member of the CDSL_Gurobi_users group. Only one person needs to start the license server, after which any member of the group can run Gurobi.

License-related Gurobi docs

Check the status of the license server
On the Biowulf login node, run a 'ps' command and look for the 'grb_ts' process, e.g.
[user@biowulf]$ ps auxw | grep grb_ts
user  78377  0.0  0.0  14620  5732 ?        S    Jan21   0:01 grb_ts

Start up the license server
If you don't see the grb_ts process running, the license server needs to be started. The Gurobi license is currently set up to use port 41954.
[user@biowulf]$  module load gurobi
[+] Loading Gurobi 9.0.0  ...

[user@biowulf]$ grb_ts 
Gurobi Token Server version 9.0.0
Gurobi Token Server version 9.0.0 started: Wed Jan 22 09:54:26 2020
Gurobi Token Server syslog: /var/log/syslog or /var/log/messages
Gurobi license file: /data/CDSL_Gurobi_users/gurobi910/gurobi.lic
Gurobi Token Server use limit: 4096
If grb_ts reports an error because that port is already in use, the port will need to be changed. It is best to contact the Biowulf staff (staff@hpc.nih.gov) to ask for an unused port number, since the Biowulf firewall rules will also need to be updated for the new port. Once you have the new port number, make a backup copy of the license file (/data/CDSL_Gurobi_users/gurobi910/gurobi.lic), carefully edit the original file to change the port number, and then start the license server with grb_ts.

Stop the license server
In case the license server needs to be stopped:
[user@biowulf]$ grb_ts -s
Gurobi Token Server version 9.0.0
Gurobi Token Server (pid 78377) killed

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program.
Sample session (user input in bold):

[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load gurobi
[+] Loading Gurobi 9.0.0  ...

[user@cn3144 ~]$ gurobi_cl $GUROBI_HOME/examples/data/coins.lp
Using license file /data/CDSL_Gurobi_users/gurobi910/gurobi.lic
Set parameter TokenServer to value biowulf.nih.gov
Set parameter TSPort to value 46325
Set parameter Username
Set parameter LogFile to value gurobi.log

Gurobi Optimizer version 9.0.0 build v9.0.0rc2 (linux64)
Copyright (c) 2019, Gurobi Optimization, LLC

Read LP format model from file /data/CDSL_Gurobi_users/gurobi910/linux64/examples/data/coins.lp
Reading time = 0.00 seconds
: 4 rows, 9 columns, 16 nonzeros
Optimize a model with 4 rows, 9 columns and 16 nonzeros
Model fingerprint: 0x5ce1d538
Variable types: 4 continuous, 5 integer (0 binary)
Coefficient statistics:
  Matrix range     [6e-02, 7e+00]
  Objective range  [1e-02, 1e+00]
  Bounds range     [5e+01, 1e+03]
  RHS range        [0e+00, 0e+00]
Found heuristic solution: objective -0.0000000
Presolve removed 1 rows and 5 columns
Presolve time: 0.00s
Presolved: 3 rows, 4 columns, 9 nonzeros
Variable types: 0 continuous, 4 integer (0 binary)

Root relaxation: objective 1.134615e+02, 2 iterations, 0.00 seconds

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0  113.46154    0    1   -0.00000  113.46154      -     -    0s
H    0     0                     113.4500000  113.46154  0.01%     -    0s
     0     0  113.46154    0    1  113.45000  113.46154  0.01%     -    0s

Explored 1 nodes (2 simplex iterations) in 0.02 seconds
Thread count was 32 (of 72 available processors)

Solution count 2: 113.45 -0

Optimal solution found (tolerance 1.00e-04)
Best objective 1.134500000000e+02, best bound 1.134500000000e+02, gap 0.0000%

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. Gurobi.sh). For example:

#!/bin/bash
set -e
module load gurobi 
gurobi_cl   ResultFile=coins.sol   $GUROBI_HOME/examples/data/coins.lp

Submit this job using the Slurm sbatch command.

sbatch [--cpus-per-task=#] [--mem=#] [--time=DD-HH:MM:SS] Gurobi.sh
where
--cpus-per-task=# Number of CPUs allocated for the job. (default=2)
--mem=# Memory allocated for the job. (default=4 GB)
--time=DD-HH:MM:SS Walltime allocated for the job. (default=2 hrs)
Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

Create a swarmfile (e.g. Gurobi.swarm). For example:

gurobi_cl   ResultFile=file1.sol   file1.lp
gurobi_cl   ResultFile=file2.sol   file2.lp
gurobi_cl   ResultFile=file3.sol   file3.lp
[...]

Submit this job using the swarm command.

swarm [-g #] [-t #] --module gurobi Gurobi.swarm
where
-g # Number of Gigabytes of memory required for each process (1 line in the swarm command file)
-t # Number of threads/CPUs required for each process (1 line in the swarm command file).
--module gurobi Loads the Gurobi module for each subjob in the swarm

Python API

To create and solve optimization models using the Python API for Gurobi, you need to install the appropriate version of the gurobipy package in your Python environment; the gurobipy version should match the Gurobi version you intend to use. For example, to use Gurobi 10.0.0 on Biowulf, install gurobipy 10.0.0.

Installing the Python API for Gurobi
If you already use Anaconda for package management, it is recommended to install the Python API with the following command in the appropriate conda environment:

conda install -c gurobi gurobi=10.0.0
It can also be installed with pip in an active Python environment (managed through Anaconda or otherwise):
python -m pip install gurobipy==10.0.0
An alternative installation method is described in the Gurobi support article: https://support.gurobi.com/hc/en-us/articles/360044290292-How-do-I-install-Gurobi-for-Python-
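
To verify that the installed gurobipy matches the Gurobi module you plan to load, you can print its version from Python. A quick sketch (gurobi.version() returns a (major, minor, technical) tuple):
  import gurobipy as gp
  # should print a tuple matching the loaded Gurobi module, e.g. (10, 0, 0)
  print(gp.gurobi.version())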

Running Gurobi using its Python API
Once gurobipy is installed in a Python environment, it can be imported into Python scripts with import gurobipy. To run a Python script that uses gurobipy on the compute cluster, the gurobi environment module needs to be loaded first. The easiest way to do this is to wrap the Python command in a shell script. For instance, if example.py is the Python script containing calls to gurobipy that needs to be run on the cluster, write bash_wrapper.sh as follows:
  #!/bin/bash
  module load gurobi/10.0.0  
  python example.py [arguments to example.py]
Submit the wrapper script to the cluster:
sbatch bash_wrapper.sh
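
For reference, example.py could be as simple as the minimal sketch below. The model is a made-up toy problem (not part of the Gurobi distribution), included only to illustrate the gurobipy workflow:
  # example.py -- build and solve a tiny model with gurobipy
  import gurobipy as gp
  from gurobipy import GRB

  m = gp.Model("demo")                      # creating the model requires a valid Gurobi license
  x = m.addVar(vtype=GRB.INTEGER, name="x")
  y = m.addVar(vtype=GRB.INTEGER, name="y")
  m.setObjective(x + 2 * y, GRB.MAXIMIZE)   # maximize x + 2y
  m.addConstr(x + y <= 4, name="cap")       # subject to x + y <= 4
  m.optimize()

  if m.Status == GRB.OPTIMAL:
      print("objective:", m.ObjVal)
      for v in m.getVars():
          print(v.VarName, v.X)
When run through bash_wrapper.sh as above, the script picks up the token-server license configured by the gurobi environment module.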

Gurobi and R

To get Gurobi 10.0.0 to work with R, follow these steps.

  1. Make sure that a grb_ts process is running on the Biowulf login node. This is the Gurobi license process.
        ps -aux | grep grb_ts
    If the process is not running, the Gurobi license server needs to be started using the instructions above.

  2. Start an interactive session, set the environment variables, load the Gurobi module and start R as below.
        biowulf% sinteractive
        salloc: Pending job allocation 58068017
        salloc: job 58068017 queued and waiting for resources
        salloc: job 58068082 has been allocated resources
        salloc: Granted job allocation 58068082
        salloc: Waiting for resource configuration
        salloc: Nodes cn4294 are ready for job
        
        [user@cn4294 ~]$ GUROBI_HOME=/data/CDSL_Gurobi_users/gurobi1000/linux64
        [user@cn4294 ~]$ GRB_LICENSE_FILE=/data/CDSL_Gurobi_users/gurobi1000/gurobi.lic
        
        [user@cn4294 ~]$ module load gurobi
        
        [user@cn4294 ~]$ module load R
        
        [user@cn4294 ~]$ R
        
        > install.packages("/data/CDSL_Gurobi_users/gurobi1000/linux64/R/gurobi_10.0-0_R_4.2.0.tar.gz", repos=NULL)

HATCHet

HATCHet is an algorithm to infer allele- and clone-specific copy-number aberrations (CNAs), clone proportions, and whole-genome duplications (WGD) for several tumor clones jointly from multiple bulk-tumor samples of the same patient, or from a single bulk-tumor sample. HATCHet was designed and developed by Simone Zaccaria in the group of Prof. Ben Raphael at Princeton University.

HATCHet documentation on github

HATCHet requires Gurobi, and is therefore only available to members of the CDSL_Gurobi_users group on Biowulf. It has been installed as a conda environment: to use HATCHet, you will need to source the conda setup script as in the example below. HATCHet also requires specific versions of samtools and bcftools; the samtools/1.6 module includes bcftools 1.6.

Note: conda initializations in your .bashrc might cause problems with loading the hatchet module. In general it is best not to include any conda initializations in your startup files.

Sample HATCHet run in an interactive session (user input in bold)

biowulf% sinteractive
salloc.exe: Pending job allocation 9064763
salloc.exe: job 9064763 queued and waiting for resources
salloc.exe: job 9064763 has been allocated resources
salloc.exe: Granted job allocation 9064763
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn0878 are ready for job

cn0878% module use --prepend /data/CDSL_Gurobi_users/modules/ 

cn0878% module load gurobi
[+] Loading Gurobi 9.1.0  ...

cn0878% module load samtools/1.6 
[+] Loading samtools 1.6  ...

cn0878% source /data/CDSL_Gurobi_users/hatchet/conda/etc/profile.d/conda.sh

cn0878% which conda 
/data/CDSL_Gurobi_users/hatchet/conda/bin/conda

cn0878% conda activate hatchet

(hatchet) cn0878% conda env list
# conda environments:
#
base                     /data/CDSL_Gurobi_users/hatchet/conda
hatchet               *  /data/CDSL_Gurobi_users/hatchet/conda/envs/hatchet

(hatchet) cn0878% python3
Python 3.7.4 (default, Oct 29 2019, 10:15:53)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-18)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import hatchet
>>> [...]
>>> quit()

(hatchet) cn0878% conda deactivate

cn0878% exit
salloc.exe: Relinquishing job allocation 9064763

biowulf% 

Sample batch script:

#!/bin/bash

module use --prepend /data/CDSL_Gurobi_users/modules/ 
module load gurobi samtools/1.6
source /data/CDSL_Gurobi_users/hatchet/conda/etc/profile.d/conda.sh
conda activate hatchet

python3 << EOF
import hatchet
[...python commands...]
EOF

conda deactivate

See examples of HATCHet use at https://github.com/raphael-group/hatchet/blob/master/examples/demo-WES/demo-wes.sh and https://github.com/raphael-group/hatchet/tree/master/script.