ParaView on Biowulf

ParaView is an open-source, multi-platform data analysis and visualization application. On Biowulf it can be run in client/server mode for interactive analysis or in batch mode with pvbatch.

In either mode, the ParaView server can run on k20x GPU nodes (EGL offscreen rendering with hardware acceleration) or, without GPU acceleration, on the multinode partition (Mesa software offscreen rendering).



Client/Server mode

k20x GPU

In this example we run a ParaView server as a batch job, allocating one node with two K20x GPUs, each used for hardware-accelerated rendering. We then connect a ParaView client running on a desktop to the server for interactive analysis. Because of the way ParaView utilizes GPUs, a job can use either half a node (1 GPU plus half the CPUs) or whole nodes allocated exclusively (1, 2, ... nodes, using both GPUs on each). The batch-mode example further below uses half a node to illustrate how to use $CUDA_VISIBLE_DEVICES; the one-node client/server example here extends easily to more than one node.
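
The half-node case can be sketched as follows. This is a minimal illustration, not from the site documentation; the variable defaults simulate what Slurm would export inside a job allocated with --gres=gpu:k20x:1:

```shell
# Minimal half-node sketch (assumption: the defaults below simulate what
# Slurm exports inside a job allocated with --gres=gpu:k20x:1).
CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-0}   # index of the single allocated GPU
SLURM_NTASKS=${SLURM_NTASKS:-8}                   # half a node's tasks

# every pvserver rank renders on the one visible GPU
cmd="mpiexec -np ${SLURM_NTASKS} pvserver --force-offscreen-rendering --displays=${CUDA_VISIBLE_DEVICES}"
echo "$cmd"
```

With a whole-node allocation there is no single GPU index to point at, which is why the script below instead splits the ranks explicitly between --displays=0 and --displays=1.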

To start a ParaView server, create a batch script similar to the following:

#! /bin/bash
# this file is
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --ntasks-per-core=1
#SBATCH --partition=gpu
#SBATCH --gres=gpu:k20x:2
#SBATCH --exclusive
#SBATCH --mem=120g
set -e

module load paraview/5.10.1 || exit 1
pvopts="--disable-xdisplay-test --force-offscreen-rendering"

ntasks_per_gpu=$(( SLURM_NTASKS_PER_NODE / 2 ))

mpiexec -map-by node -iface enet0 \
    -np ${ntasks_per_gpu} pvserver ${pvopts} --displays=0 : \
    -np ${ntasks_per_gpu} pvserver ${pvopts} --displays=1

Note that the mpiexec is set up such that half the tasks are assigned to each of the two GPUs on each node. This will only work properly if only whole nodes are allocated.

In this example we allocate a single node with 2 GPUs, so eight tasks share each GPU for rendering. In some circumstances it may be better to use 2 tasks per core. Since all resource requests are given as #SBATCH directives in the script, we can submit with

[user@biowulf]$ sbatch
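
The arithmetic in the script splits the node's tasks evenly between the two GPUs. Simulated outside a job (SLURM_NTASKS_PER_NODE is normally set by Slurm), it looks like this:

```shell
# simulate the value Slurm would set inside the job above
SLURM_NTASKS_PER_NODE=16

# half the tasks go to each of the two K20x GPUs
ntasks_per_gpu=$(( SLURM_NTASKS_PER_NODE / 2 ))
echo "pvserver ranks per GPU: ${ntasks_per_gpu}"
```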

Multinode CPU rendering

In some cases the server may need more memory than is available on the k20x nodes, or the data may exceed the k20x GPU memory. In such cases the Mesa-based CPU server can be used:

#! /bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=28
#SBATCH --partition=multinode
#SBATCH --exclusive
#SBATCH --mem=240g
#SBATCH --constraint=x2695

set -e

module load paraview/5.10.1-mesa

pvopts="--disable-xdisplay-test --force-offscreen-rendering"
mpiexec -iface ib0 pvserver ${pvopts}
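
As a quick sanity check on the directives above: two nodes at 28 tasks per node give mpiexec 56 pvserver ranks in total.

```shell
# rank count implied by --nodes=2 and --ntasks-per-node=28 in the script above
nodes=2
ntasks_per_node=28
total_ranks=$(( nodes * ntasks_per_node ))
echo "total pvserver ranks: ${total_ranks}"
```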

Client connections

The server will run until the job reaches its time limit, is canceled, or the client terminates its connection to the server.

Wait until the server job starts, then check the Slurm output file for the hostname and port of the server:

[user@biowulf]$ squeue -j 50077458
50077458       gpu paraview     user PD       0:00      1 (None)

[user@biowulf]$ squeue -j 50077458
50077458       gpu paraview     user  R       1:36      1 cn0605

[user@biowulf]$ cat slurm-50077458.out
[+] Loading paraview 5.10.1
Waiting for client...
Connection URL: cs://cn0605:11111
Accepting connection(s): cn0605:11111
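
If you prefer not to read the host and port off by eye, they can be pulled out of that line with plain shell parameter expansion. A sketch, using the sample line above:

```shell
# sample line copied from the Slurm output above
line='Accepting connection(s): cn0605:11111'

hostport=${line##* }   # text after the last space -> cn0605:11111
host=${hostport%%:*}   # -> cn0605
port=${hostport##*:}   # -> 11111
echo "server: ${host}  port: ${port}"
```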

Now we need a tunnel from your local machine to the compute node indicated by the pvserver process. Assuming your machine is on the NIH campus or on VPN this command will set up the tunnel on a Mac or a Linux machine to the node via biowulf:

[myComputer]$ ssh -L 11111:localhost:11111 user@biowulf.nih.gov \
                     -t ssh -L 11111:localhost:11111 user@cn0605

Replace 'user' with your username and 'cn0605' with the node shown in the Slurm output file.

Using PuTTY on a Windows machine involves a two-step process: first create a tunnel from your local machine to biowulf, then a second tunnel from biowulf to the compute node.

Once a tunnel has been established, start your local ParaView client and click the connect button. The brief screencast below shows how to connect and open an example data set.

ParaView client connection to server (4 min)


Batch mode

We will use a very simple Python script that renders a sphere and colors it by the rank of the MPI process that created it:

# this file is
from paraview.simple import *

# render a sphere in parallel and color it by MPI rank
sphere = Sphere()
sphere.ThetaResolution = 32
rep = Show()
ColorBy(rep, ('POINTS', 'vtkProcessId'))

# render offscreen and save the result so the batch run produces output
Render()
SaveScreenshot('sphere.png')

Then create a batch input file to run the python script on a single K20x GPU:

# this file is
module load paraview || exit 1

mpirun -iface enet0 -np $SLURM_NTASKS \
    pvbatch --egl-device-index=$CUDA_VISIBLE_DEVICES \
    --force-offscreen-rendering --disable-xdisplay-test \
    sphere.py    # script name assumed; use whatever name you saved the script above under

Submit this job using the Slurm sbatch command.

[user@biowulf]$ sbatch --ntasks=4 --partition=gpu --mem=10g \
                      --gres=gpu:k20x:1 --nodes=1