pymc on Biowulf

PyMC is an open source probabilistic programming framework written in Python. It computes gradients via automatic differentiation and compiles probabilistic programs on the fly to C for increased speed, using Theano in PyMC3 (version 3) and Aesara in PyMC version 4. Unlike many other probabilistic programming languages, PyMC lets you specify models directly in Python code.

Documentation
Important Notes

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf]$ sinteractive -c 16 --mem 40g
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ]$ module load pymc/3
[+] Loading singularity on cn3144
[+] Loading pymc 3.11.5  on cn3144

[user@cn3144 ]$ python-pymc
Python 3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:56:21) 
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pymc3 as pm
>>> import theano
>>> quit()

[user@cn3144 ]$ module load pymc/4
[+] Loading singularity on cn3144
[+] Loading pymc 4.1.3  on cn3144

[user@cn3144 ]$ python-pymc
Python 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:04:59) [GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pymc as pm
>>> import aesara
>>> quit()

[user@cn3144 ]$ exit

salloc.exe: Relinquishing job allocation 46116226
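
Because the two modules expose the library under different import names (pymc3 for pymc/3, pymc for pymc/4), a script meant to run under either module may need to pick the right one at run time. The following is a minimal sketch of one way to do that; the helper names pick_pymc_package and load_pymc are hypothetical and not part of PyMC or of the Biowulf installation.

```python
import importlib
import importlib.util


def pick_pymc_package(candidates=("pymc", "pymc3"), is_available=None):
    """Return the first importable package name in `candidates`, or None.

    `is_available` is a hook (hypothetical, for testing) that takes a package
    name and returns True if it can be imported. By default it asks importlib.
    """
    if is_available is None:
        def is_available(name):
            return importlib.util.find_spec(name) is not None
    for name in candidates:
        if is_available(name):
            return name
    return None


def load_pymc():
    """Import and return whichever PyMC package the loaded module provides."""
    name = pick_pymc_package()
    if name is None:
        # Neither import name was found, e.g. no pymc module is loaded or the
        # script was started with a Python other than python-pymc.
        raise ImportError(
            "No PyMC package found; load pymc/3 or pymc/4 and use python-pymc"
        )
    return importlib.import_module(name)
```

Inside python-pymc, `pm = load_pymc()` would then stand in for `import pymc3 as pm` or `import pymc as pm`. Note that model code is not always portable between the two versions, so this only addresses the import name.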

Run PyMC 4 with JAX on a GPU node:

[user@biowulf ]$ sinteractive --gres=gpu:p100:1
[user@cn3144 ]$ module load pymc/4
[+] Loading singularity on cn3144
[+] Loading pymc 4.1.3  on cn3144

[user@cn3144 ]$ python-pymc
Python 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:04:59) [GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pymc.sampling_jax as pmjax
>>> quit()

[user@cn3144 ]$ exit
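
The session above only imports the JAX sampling module. As a rough sketch of what might follow, the code below fits a toy normal model with the NumPyro NUTS sampler from pymc.sampling_jax. The helper names allocated_gpus and sample_with_numpyro, the toy model, and the sampler settings are illustrative assumptions. The sketch also assumes that Slurm exports CUDA_VISIBLE_DEVICES for jobs started with --gres=gpu, which is typical but not confirmed here for Biowulf.

```python
import os


def allocated_gpus(env=None):
    """List the GPU identifiers found in CUDA_VISIBLE_DEVICES.

    Returns an empty list if the variable is unset or empty. Some Slurm
    versions are reported to set the placeholder "NoDevFiles" when no GPU
    is allocated, so that value is ignored as well (assumption).
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    items = [item.strip() for item in raw.split(",")]
    return [item for item in items if item and item != "NoDevFiles"]


def sample_with_numpyro(observed, draws=1000, tune=1000, chains=2):
    """Fit a toy normal model with the JAX-based NumPyro NUTS sampler."""
    # Imported lazily so the helper above can be used without PyMC loaded.
    import pymc as pm
    import pymc.sampling_jax as pmjax

    with pm.Model():
        mu = pm.Normal("mu", mu=0.0, sigma=10.0)
        sigma = pm.HalfNormal("sigma", sigma=1.0)
        pm.Normal("y", mu=mu, sigma=sigma, observed=observed)
        # Must be called inside the model context (or given model=...).
        return pmjax.sample_numpyro_nuts(draws=draws, tune=tune, chains=chains)
```

In a GPU session one might first check that `allocated_gpus()` is non-empty and then call `sample_with_numpyro([0.1, -0.3, 0.2])`. JAX falls back to the CPU when it does not find a usable GPU, so the check is a convenience rather than a requirement.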

Run PyMC in a Jupyter notebook:

[user@biowulf ]$ sinteractive -c 16 --mem 40g --tunnel
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
Created 1 generic SSH tunnel(s) from this compute node to                  
biowulf for your use at port numbers defined                               
in the $PORTn ($PORT1, ...) environment variables.                         
                                                                           
                                                                           
Please create a SSH tunnel from your workstation to these ports on biowulf.
On Linux/MacOS, open a terminal and run:                                   
                                                                           
    ssh  -L 33327:localhost:33327 biowulf.nih.gov                          
                                                                           
For Windows instructions, see https://hpc.nih.gov/docs/tunneling          
[user@cn3144]$ module load pymc/3
[user@cn3144]$ jupyter notebook --ip localhost --port $PORT1 --no-browser
[I 17:11:40.505 NotebookApp] Serving notebooks from local directory
[I 17:11:40.505 NotebookApp] Jupyter Notebook 6.4.10 is running at:
[I 17:11:40.505 NotebookApp] http://localhost:37859/?token=xxxxxxxx
[I 17:11:40.506 NotebookApp]  or http://127.0.0.1:37859/?token=xxxxxxx
[I 17:11:40.506 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 17:11:40.512 NotebookApp]

    To access the notebook, open this file in a browser:
        file:///home/apptest1/.local/share/jupyter/runtime/nbserver-29841-open.html
    Or copy and paste one of these URLs:
        http://localhost:37859/?token=xxxxxxx
     or http://127.0.0.1:37859/?token=xxxxxxx

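After creating the SSH tunnel from your workstation as instructed in the session output, open one of the URLs printed by Jupyter in a browser on your local computer to connect to the notebook.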

Batch job
Most jobs should be run as batch jobs.

Create a Python script (e.g. pymc-script.py). For example:

import pymc3 as pm
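
The one-line script above only imports the library. A slightly fuller pymc-script.py might look like the sketch below, which sizes the sampler to the CPUs requested with sbatch -c by reading SLURM_CPUS_PER_TASK. The function names slurm_cores and fit_mean, the toy model, and the sampler settings are illustrative assumptions, not a recommended configuration.

```python
import os


def slurm_cores(default=1, env=None):
    """Return the CPU count Slurm allocated per task, or `default`.

    Slurm sets SLURM_CPUS_PER_TASK when -c/--cpus-per-task is given.
    Missing, non-numeric, or zero values fall back to `default`.
    """
    env = os.environ if env is None else env
    raw = env.get("SLURM_CPUS_PER_TASK", "")
    return int(raw) if raw.isdigit() and int(raw) > 0 else default


def fit_mean(observed, cores, chains=4):
    """Estimate the mean of `observed` with a toy PyMC3 model."""
    # Imported lazily so slurm_cores() can be used without PyMC loaded.
    import pymc3 as pm

    with pm.Model():
        mu = pm.Normal("mu", mu=0.0, sigma=10.0)
        pm.Normal("y", mu=mu, sigma=1.0, observed=observed)
        # Each chain runs in its own process, so using more cores than
        # chains brings no benefit for this call.
        return pm.sample(
            draws=1000,
            tune=1000,
            chains=chains,
            cores=min(chains, cores),
            return_inferencedata=True,
        )
```

The script would then end with a call such as `idata = fit_mean([0.1, -0.3, 0.2], cores=slurm_cores())`. Reading the allocation from the environment keeps the script consistent with whatever -c value is passed to sbatch.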

Create a batch input file (e.g. pymc.sh). For example:

#!/bin/bash
module load pymc/3
python-pymc pymc-script.py 

Submit this job using the Slurm sbatch command.

sbatch -c 16 --mem 40g pymc.sh