Biowulf High Performance Computing at the NIH
Jupyter on Biowulf

Jupyter is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more.

Starting a Jupyter Instance

To connect the browser on your computer to a Jupyter notebook running on a compute node, you need a tunnel with two legs: one from your computer to biowulf, and one from biowulf to the compute node. sinteractive can create the second leg (from biowulf to the compute node) automatically when started with the -T/--tunnel option. For more details, and for instructions on setting up the first leg (from your computer to biowulf), see our tunneling documentation.

Note that the Python environment hosting the Jupyter installation is minimal. Use the pyX.X kernels to get the fully featured Python environments.

Allocate an interactive session and start a jupyter instance as shown below. First, we launch tmux (or screen) on the login node so that we don't lose our session if our connection to the login node drops.

[user@biowulf]$ module load tmux # You can use screen instead; you don't need to module load it
[user@biowulf]$ tmux
[user@biowulf]$ sinteractive --gres=lscratch:5 --mem=10g --tunnel
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
                                                                           
Created 1 generic SSH tunnel(s) from this compute node to                  
biowulf for your use at port numbers defined                               
in the $PORTn ($PORT1, ...) environment variables.                         
                                                                           
                                                                           
Please create a SSH tunnel from your workstation to these ports on biowulf.
On Linux/MacOS, open a terminal and run:                                   
                                                                           
    ssh  -L 33327:localhost:33327 biowulf.nih.gov                          
                                                                           
For Windows instructions, see https://hpc.nih.gov/docs/tunneling           

[user@cn3144]$ cd /lscratch/$SLURM_JOB_ID
[user@cn3144]$ module load jupyter
[user@cn3144]$ cp ${JUPYTER_TEST_DATA:-none}/* .
[user@cn3144]$ ls -lh
total 196K
-rw-r--r-- 1 user group 7.8K Oct  1 14:31 Pokemon.csv
-rw-r--r-- 1 user group 186K Oct  1 14:31 Seaborn_test.ipynb

[user@cn3144]$ jupyter kernelspec list
Available kernels:
  python3          /usr/local/Anaconda/envs/jupyter/lib/python3.6/site-packages/ipykernel/resources
  bash             /usr/local/Anaconda/envs/jupyter/share/jupyter/kernels/bash
  calysto_xonsh    /usr/local/Anaconda/envs/jupyter/share/jupyter/kernels/calysto_xonsh
  ir35             /usr/local/Anaconda/envs/jupyter/share/jupyter/kernels/ir35
  py2.7            /usr/local/Anaconda/envs/jupyter/share/jupyter/kernels/py2.7
  py3.5            /usr/local/Anaconda/envs/jupyter/share/jupyter/kernels/py3.5
  py3.6            /usr/local/Anaconda/envs/jupyter/share/jupyter/kernels/py3.6
  xonsh            /usr/local/Anaconda/envs/jupyter/share/jupyter/kernels/xonsh

In the example shown here I use the port reserved by sinteractive (33327). Please substitute the port number that sinteractive actually set up for your session. The environment variable $PORT1 holds this port for convenience.
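Before launching Jupyter it can help to confirm that sinteractive actually exported a port. The following is a minimal sketch, assuming a bash shell on the compute node; `require_port` is a hypothetical helper name used for illustration and is not part of the Biowulf tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not provided by Biowulf): prints its argument and
# succeeds only if the argument looks like a usable TCP port number.
require_port() {
    local port="${1:-}"
    case "$port" in
        # Reject empty values, non-digits, and anything longer than 5 digits.
        ''|*[!0-9]*|??????*)
            echo "no valid port given; was sinteractive started with --tunnel?" >&2
            return 1
            ;;
    esac
    if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
        echo "port $port is outside the valid range 1-65535" >&2
        return 1
    fi
    echo "$port"
}

# Report the port without starting anything.
if port=$(require_port "${PORT1:-}"); then
    echo "Jupyter would listen on port $port"
else
    echo "No tunnel port available in this shell"
fi
```

If the check fails, the session was most likely started without --tunnel, and Jupyter would have no forwarded port to listen on.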

[user@cn3144]$ # You can use "jupyter lab" or "jupyter console" instead. The port must be unique to avoid
[user@cn3144]$ # clashing with other users. --no-browser: we're not running the browser on the compute node!
[user@cn3144]$ jupyter notebook --ip localhost --port $PORT1 --no-browser
[I 12:48:25.645 NotebookApp] [nb_conda_kernels] enabled, 20 kernels found
[I 12:48:26.053 NotebookApp] [nb_anacondacloud] enabled
[I 12:48:26.077 NotebookApp] [nb_conda] enabled
[I 12:48:26.322 NotebookApp] ✓ nbpresent HTML export ENABLED
[W 12:48:26.323 NotebookApp] ✗ nbpresent PDF export DISABLED: No module named nbbrowserpdf.exporters.pdf
[I 12:48:26.330 NotebookApp] Serving notebooks from local directory: /spin1/users/user
[I 12:48:26.330 NotebookApp] 0 active kernels 
[I 12:48:26.330 NotebookApp] The Jupyter Notebook is running at: http://localhost:33327/?token=xxxxxxxxxx
[I 12:48:26.331 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 12:48:26.333 NotebookApp] 
    
    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://localhost:33327/?token=xxxxxxxxxx

Keep this session open for as long as you're using your notebook.

For documentation on how to connect a tunnel from your computer to the tunnel on biowulf see our tunneling documentation.
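On Linux or macOS, the workstation side of the tunnel is the ssh command that sinteractive prints when the session starts. As a minimal sketch, the snippet below assembles that command for a given port. `tunnel_command` is a hypothetical helper name, and 33327 is only the example port from the session above; it assumes your workstation can reach biowulf.nih.gov with your usual login.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the ssh command that forwards a local port
# to the same port on biowulf. Run this on your workstation, not on biowulf.
tunnel_command() {
    local port="$1"
    local host="${2:-biowulf.nih.gov}"
    printf 'ssh -L %s:localhost:%s %s\n' "$port" "$port" "$host"
}

# 33327 is the example port; substitute the one reserved for your session.
tunnel_command 33327
```

Run the printed command in a new terminal on your workstation, leave it open, and then paste the URL printed by Jupyter (including its token) into your local browser.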