ParaView on Biowulf

ParaView is an open-source, multi-platform data analysis and visualization application. On Biowulf it can be run in two modes, both described below: client/server mode and batch mode.

ParaView should be run on the K20X GPU nodes, not the newer GPUs.
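
To check how many K20x GPU nodes are currently free before submitting, Biowulf's freen utility can be consulted; filtering its output for "k20x" is an assumption about the output format, so adjust as needed:

[user@biowulf]$ freen | grep -i k20x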


Client/Server mode

In this example we will run a ParaView server as a batch job, allocating one node with two K20x GPUs, both used for hardware-accelerated rendering. We then connect a ParaView client running on a desktop to the server for interactive analysis. Because of the way ParaView makes use of GPUs, a job should allocate either half a node (1 GPU plus half the CPUs) or whole nodes (i.e. 1, 2, ... nodes allocated exclusively, using both GPUs on each node). The batch mode example below uses half a node to illustrate how to use $CUDA_VISIBLE_DEVICES, and the one-node client/server example can easily be extended to more than one node.

To start a ParaView server, create a batch script similar to the following:

#!/bin/bash
# this file is paraview_server.sh
set -e

module load paraview/5.4.1 || exit 1

# render offscreen on the GPUs, with no X display required
pvopts="--disable-xdisplay-test --use-offscreen-rendering"

# split the MPI tasks evenly between the node's two GPUs
ntasks_per_gpu=$(( SLURM_NTASKS / 2 ))

mpirun --map-by node --mca btl_openib_if_exclude "mlx4_0:1" \
    -np ${ntasks_per_gpu} pvserver ${pvopts} --egl-device-index=0 : \
    -np ${ntasks_per_gpu} pvserver ${pvopts} --egl-device-index=1

Note that the mpirun command is set up so that half of the tasks are assigned to each of the two GPUs on the node. This only works properly when whole nodes are allocated.
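
If you instead allocate half a node (a single K20x GPU), the split is not needed. A minimal sketch, assuming the same module and options as above, uses $CUDA_VISIBLE_DEVICES to select the allocated GPU, the same approach as the batch mode example further below:

#!/bin/bash
# half-node sketch (1 K20x GPU): all tasks render on the allocated GPU
module load paraview/5.4.1 || exit 1
pvopts="--disable-xdisplay-test --use-offscreen-rendering"

mpirun --mca btl_openib_if_exclude "mlx4_0:1" -np $SLURM_NTASKS \
    pvserver ${pvopts} --egl-device-index=$CUDA_VISIBLE_DEVICES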

In this example we allocate one node with 2 GPUs and 32 tasks, so that 16 tasks share each GPU for rendering. In some circumstances it may be better to use only one task per core. This translates into the following sbatch submission command:

[user@biowulf]$ sbatch --ntasks=32 --ntasks-per-core=2 --partition=gpu --mem=120g \
                      --gres=gpu:k20x:2 --exclusive --nodes=1 paraview_server.sh
50077458

This server will run until the job hits its time limit, is canceled, or is stopped when the client terminates its connection to the server.
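
To stop the server manually before it reaches its time limit, cancel the job with scancel, using the job ID reported by sbatch:

[user@biowulf]$ scancel 50077458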

Wait until the job starts, then check the Slurm output file for the hostname and port of the server:

[user@biowulf]$ squeue -j 50077458
   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
50077458       gpu paraview     user PD       0:00      1 (None)

[user@biowulf]$ squeue -j 50077458
   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
50077458       gpu paraview     user  R       1:36      1 cn0605

[user@biowulf]$ cat slurm-50077458.out
[+] Loading openmpi 2.0.3 for GCC 7.2.0
[+] Loading paraview 5.4.1
Waiting for client...
Connection URL: cs://cn0605:11111
Accepting connection(s): cn0605:11111

Now we need an SSH tunnel from your local machine to the compute node running the pvserver processes. Assuming your machine is on the NIH campus or connected to the VPN, the following command sets up the tunnel from a Mac or Linux machine to the node via the Biowulf login node:

[myComputer]$ ssh -L 11111:localhost:11111 user@biowulf.nih.gov \
                     -t ssh -L 11111:localhost:11111 user@cn0605

Replace 'user' with your username and 'cn0605' with the node shown in the Slurm output file.

Using PuTTY on a Windows machine involves a two-step process: first create a tunnel from your local machine to the Biowulf login node, then from Biowulf create a second tunnel to the compute node (cn0605 in this example).
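
If PuTTY's command-line tool plink is available, roughly the same two-hop tunnel can be set up in one line; this is a sketch, since plink's -L and -t options mirror ssh but should be verified against your PuTTY version:

[myComputer]$ plink -ssh -L 11111:localhost:11111 user@biowulf.nih.gov -t ssh -L 11111:localhost:11111 user@cn0605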

Once a tunnel has been established, start your local ParaView client and click the connect button. The brief screencast below shows how to connect and open an example data set.
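
Alternatively, the connection can be scripted. A minimal sketch using pvpython on your local machine, assuming a local ParaView installation matching the server version and that the tunnel above is active:

# connect_client.py -- run locally with: pvpython connect_client.py
from paraview.simple import *

# connect through the SSH tunnel to the pvserver on the compute node
Connect('localhost', 11111)

# build a trivial pipeline on the server and render it
Sphere()
Show()
Render()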

Notes:

Disconnecting the ParaView client from the server terminates the server job.

Batch mode

We will use a very simple Python script that renders a sphere and colors it by the rank of the MPI process that created it:

# this file is sphere.py
from paraview.simple import *

# create a sphere source; in parallel each MPI rank generates a piece of it
sphere = Sphere()
sphere.ThetaResolution = 32

# display the sphere and color its points by the MPI process that produced them
rep = Show()
ColorBy(rep, ('POINTS', 'vtkProcessId'))
Render()

# rescale the color map to cover all process IDs, then re-render and save
rep.RescaleTransferFunctionToDataRange(True)
Render()
WriteImage('sphere.png')

Then create a batch input file to run the python script on a single K20x GPU:

#!/bin/bash
# this file is sphere.sh
module load paraview || exit 1

# $CUDA_VISIBLE_DEVICES holds the index of the single GPU allocated to this job
mpirun --mca btl_openib_if_exclude "mlx4_0:1" -np $SLURM_NTASKS \
    pvbatch --egl-device-index=$CUDA_VISIBLE_DEVICES \
    --use-offscreen-rendering --disable-xdisplay-test sphere.py

Submit this job using the Slurm sbatch command:

[user@biowulf]$ sbatch --ntasks=4 --partition=gpu --mem=10g \
                      --gres=gpu:k20x:1 --nodes=1 sphere.sh
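
When the job finishes, the image written by the WriteImage call should appear in the submission directory:

[user@biowulf]$ ls -l sphere.png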