Biowulf High Performance Computing at the NIH
UNet: a convolutional network for biomedical image segmentation

UNet is a winner of the ISBI bioimage segmentation challenge 2015. It relies on data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization.
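
To make the contracting and expanding paths concrete, the following minimal sketch traces how the feature-map size and the number of filters might change through a U-Net-like model. It is an illustration under stated assumptions, not the code of this application: the function name `unet_shapes`, the depth of 4 pooling levels, "same" padding, and the halve-resolution/double-filters rule are all assumed here.

```python
def unet_shapes(width=256, height=256, start_filters=64, depth=4):
    """Return (stage, width, height, filters) tuples for a U-Net-like model.

    Assumptions (not taken from the application): each contracting step
    halves the resolution and doubles the filter count, each expanding
    step reverses this, and convolutions preserve the spatial size.
    """
    factor = 2 ** depth
    if width % factor or height % factor:
        raise ValueError(f"image size must be a multiple of {factor}")
    shapes = []
    w, h, f = width, height, start_filters
    # Contracting path: captures context at progressively coarser scales.
    for level in range(depth):
        shapes.append((f"down{level}", w, h, f))
        w, h, f = w // 2, h // 2, f * 2
    shapes.append(("bottleneck", w, h, f))
    # Expanding path: restores the resolution for precise localization.
    for level in reversed(range(depth)):
        w, h, f = w * 2, h * 2, f // 2
        shapes.append((f"up{level}", w, h, f))
    return shapes


for stage, w, h, f in unet_shapes():
    print(f"{stage:10s} {w:4d} x {h:4d}  filters={f}")
```

With four pooling levels, the input size has to be divisible by 2^4 = 16 for the expanding path to return to the original resolution.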

This application is being used as a biological example in class #1 of the course "Deep Learning by Example on Biowulf".


Important Notes

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf]$ sinteractive --mem=16g --gres=gpu:v100:1,lscratch:10 -c4 
[user@cn4471 ~]$ module load unet
[+] Loading cuDNN/8.0.3/CUDA-11.0 libraries...
[+] Loading CUDA Toolkit  11.0.3  ...
[+] Loading gcc 4.8.5  ...
[+] Loading Qt 5.9.4  ...
[+] Loading UNet 20210223  ...
The bin directory of the application includes three executable files. To display the usage message and the available command line options for an executable, type its name followed by the option "-h". For example:
[user@cn4471 ~]$ -h 
usage: [-h] -d data_folder [-a augmentation_rate] [-b batch_size] [--beta tversky_beta] [-c object_class]
                [--cw class_weights] [--drop_rate drop_rate] [-e num_epochs] [-F start_filters] [-g num_gpus]
                [--gamma tversky_gamma] [-l learning_rate] [--loss loss_type] [-n] [-s] [--save_augmented] [-v]
                [-w] [-X X] [-Y Y]

optional arguments:
  -h, --help            show this help message and exit
  -a augmentation_rate, --augmentation_rate augmentation_rate
                        by how many folds to increase the effective data size; default=20
  -b batch_size, --bs batch_size
                        batch size; default=2
  --beta tversky_beta   class balancing weight in the Trersky index: TI = TP/(TP + beta*FP + (1-beta)*FN)
  -c object_class, --object_class object_class
                        Detected object class(es): memb | mito | multi; default = membr
  --cw class_weights    Comma-separated class weights
  --drop_rate drop_rate
                        drop rate; default=0.5
  -e num_epochs, --num_epochs num_epochs
                        number of epochs; default=160
  -F start_filters, --start_filters start_filters
                        num. filters used in the 1st convolution of the network model; default=64 if
                        object_class=membr; =8 if object_class=mito; and =48 if object_class=multi
  -g num_gpus, --num_gpus num_gpus
                        number of gpus to use; default=1
  --gamma tversky_gamma
                        a power in the Trersky focal loss
  -l learning_rate, --lr learning_rate
                        learning rate; default=1.e-4
  --loss loss_type      loss type: bce | cce | wcce | dice | jaccard | tversky; default: wcce if object_class ==
                        'multi' and bce otherwise
  -n, --no_augmentation
                        don't perform data augmentation, i.e. use original input data; default=False
  -s, --summary         print the model summary
  --save_augmented      save augmented data (in the subfolder 'augmented'
  -v, --verbose         increase the verbosity level of output
  -w, --load_weights    read weights from a checkpoint file
  -X X, --image_width X
                        image width; should be multiple of 16; default=256
  -Y Y, --image_height Y
                        image height; should be multiple of 16; default=256

required arguments:
  -d data_folder, --data_folder data_folder
                        data folder name, e.g. 'data_isbi' or 'data_hhmi'
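
The Tversky index quoted in the help text for --beta can be illustrated with a small sketch on flattened binary masks. The function names, the default beta=0.5, and the form (1 - TI)**gamma for the focal variant are assumptions made for this example; the application's own loss may be defined differently.

```python
def tversky_index(y_true, y_pred, beta=0.5):
    """TI = TP / (TP + beta*FP + (1 - beta)*FN) for flat 0/1 sequences."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = tp + beta * fp + (1.0 - beta) * fn
    # Two empty masks agree perfectly, so return 1.0 in that case.
    return tp / denom if denom else 1.0


def focal_tversky_loss(y_true, y_pred, beta=0.5, gamma=1.0):
    """Assumed focal form: (1 - TI) raised to the power gamma."""
    return (1.0 - tversky_index(y_true, y_pred, beta)) ** gamma


y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0]   # TP=2, FP=2, FN=1
print(tversky_index(y_true, y_pred, beta=0.5))
print(tversky_index(y_true, y_pred, beta=0.3))
```

With beta=0.5 the index equals the Dice coefficient; a smaller beta penalizes false positives less and false negatives more.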
In order to run the training executable on the available sample data, first copy the data to your current folder:
[user@cn4471 ~]$ cp -r $UNET_DATA/* .
There are currently two sample datasets available, both comprising 2D EM images of Drosophila brain slices. The first dataset includes 30 pre-processed grayscale images together with corresponding binary masks for neural membranes from the ISBI Challenge. It comes together with the Keras UNet implementation code available at GitHub. This dataset is stored in the folder "data_isbi".

The second dataset, stored in the folder "data_hhmi", includes 24 pre-processed grayscale images together with the corresponding binary masks for both the neural membranes and mitochondria. This more challenging dataset was generated as a part of the Fly Brain Connectome project conducted at the Howard Hughes Medical Institute.

Here is the command to train the UNet on the augmented data from the 1st dataset under default options:
[user@cn4471 ~]$ -d data_isbi
Using Tensorflow backend.
Epoch 1/100
300/300 [==============================] - 28s 92ms/step - loss: 0.6890 - acc: 0.7793
Epoch 2/100
300/300 [==============================] - 21s 71ms/step - loss: 0.6809 - acc: 0.7817
Epoch 3/100
300/300 [==============================] - 21s 71ms/step - loss: 0.6731 - acc: 0.7815
...
Epoch 99/100
300/300 [==============================] - 21s 71ms/step - loss: 0.0979 - acc: 0.9765
Epoch 100/100
300/300 [==============================] - 21s 71ms/step - loss: 0.0965 - acc: 0.9766
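
The step count "300/300" in the log above is consistent with a simple piece of arithmetic: 30 input images, multiplied by the default augmentation rate of 20, divided by the default batch size of 2. The sketch below encodes that relation; the function name and the formula itself are assumptions inferred from the log, not taken from the application's code.

```python
def steps_per_epoch(num_images, augmentation_rate=20, batch_size=2):
    """Assumed relation: one step processes one batch of augmented images."""
    return (num_images * augmentation_rate) // batch_size


print(steps_per_epoch(30))   # 30 images, default options
print(steps_per_epoch(24))   # 24 images, default options
```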
The training results, i.e. the model weights, will be stored in the HDF5 format in a checkpoint file in the folder "checkpoints",
in this particular case - in the file:
The prefix of the output checkpoint file can be changed through a command line option of the

We can now use this file to predict membrane masks using as input 30 unaugmented grayscale images:
[user@cn4471 ~]$ -d data_isbi
Using TensorFlow backend.
30/30 [==============================] - 2s 81ms/step
For each grayscale image file i.png (i=0,1,...,29), this command will produce a binary mask i_predict.png together with an RGB image i_predict_RGB.png in which the connected components of the binary mask are colored.

The predictions will be stored in the folder data_isbi/membrane/test.
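
As an illustration of what coloring the connected components of a binary mask involves, here is a minimal pure-Python sketch. It assumes 4-connectivity and a small hypothetical palette; the connectivity rule and colors actually used by the application are not stated in this document.

```python
from collections import deque


def label_components(mask):
    """Label the 4-connected components of a binary mask (list of 0/1 rows)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not labels[r][c]:
                # Start a new component and flood-fill it breadth-first.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def colorize(labels, palette=((255, 0, 0), (0, 255, 0), (0, 0, 255))):
    """Map label 0 to black and every other label to a palette color."""
    return [[(0, 0, 0) if v == 0 else palette[(v - 1) % len(palette)]
             for v in row] for row in labels]


mask = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
labels, count = label_components(mask)
print(count)
print(colorize(labels)[1])
```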

In order to visualize these results, use the application:
[user@cn4471 ~]$ -h 
usage: [-h] [-c object_class] [-d data_folder] [-i image_path] [-n image_id]

optional arguments:
  -h, --help            show this help message and exit
  -c object_class, --object_class object_class
                        Detected object class(es): membr | mito | multi; default = membr
  -d data_folder, --data_folder data_folder
                        path to the top data folder
  -i image_path, --image image_path
                        a path to the image to be visualized
  -n image_id, --image_id image_id
                        a number in the range(num_images)
Here, either the option -i or -n is required. For example, to visualize the 0-th data item, type:
[user@cn4471 ~]$ -n 0 -d data_isbi 

In order to use the second dataset, one can run similar commands, but with the option "-d data_hhmi":
[user@cn4471 ~]$ -d data_hhmi 
Using TensorFlow backend.
Epoch 1/100
300/300 [==============================] - 31s 104ms/step - loss: 0.1928 - acc: 0.9111
Epoch 2/100
300/300 [==============================] - 22s 72ms/step - loss: 0.1388 - acc: 0.9335
Epoch 3/100
300/300 [==============================] - 21s 71ms/step - loss: 0.1269 - acc: 0.9395
...
Epoch 159/160
240/240 [==============================] - 17s 72ms/step - loss: 0.0351 - acc: 0.9856
Epoch 160/160
240/240 [==============================] - 17s 72ms/step - loss: 0.0349 - acc: 0.9856
This command will produce a checkpoint file hhmi.membrane.h5 in folder "checkpoints".
Alternatively, a folder with already pre-computed checkpoint files can be copied from $UNET_DATA:
[user@cn4471 ~]$ cp -r $UNET_DATA/checkpoints . 
Now we can run the executable
[user@cn4471 ~]$ -d data_hhmi 
Using TensorFlow backend
24/24 [==============================] - 2s 98ms/step
[user@cn4471 ~]$ -n 1 -d data_hhmi

Likewise, the UNet model can be trained on the HHMI mitochondria data, followed by prediction and visualization of the binary segmentation of mitochondria:
[user@cn4471 ~]$ -d data_hhmi -c mito 
[user@cn4471 ~]$ -d data_hhmi -c mito 
[user@cn4471 ~]$ -n 1 -d data_hhmi -c mito

Using additional command line options, one can also
- store checkpoints for each epoch, rather than for the last epoch only;
- output a summary of the network model;
- change the data type for the 2nd dataset from "membrane" (default) to "mito" (=mitochondria); as well as
- vary other hyper-parameters, such as the number of training epochs, the batch size, the number of images produced during augmentation of the training data, etc.

In order to train the UNet using multiple GPUs,
- allocate a session with an appropriate number of GPUs (you are allowed to use up to 4 GPUs per session),
- specify through the command line option -g how many GPUs you want to use, and
- specify a batch size that is a multiple of the number of GPUs you will be using.
For example:
[user@cn4471 ~]$ exit
[user@biowulf ~]$ sinteractive --mem=16g --gres=gpu:v100:4,lscratch:40 -c14
[user@cn4471 ~]$ module load unet 
[user@cn4471 ~]$ cp -r $UNET_DATA/* .
[user@cn4471 ~]$ -d data_isbi -g 4 -b 8 
Using TensorFlow backend.
 StreamExecutor with strength 1 edge matrix:
2019-04-23 07:38:17.419226: I tensorflow/core/common_runtime/gpu/]      0 1 2 3 
2019-04-23 07:38:17.419241: I tensorflow/core/common_runtime/gpu/] 0:   N Y N N 
2019-04-23 07:38:17.419252: I tensorflow/core/common_runtime/gpu/] 1:   Y N N N 
2019-04-23 07:38:17.419262: I tensorflow/core/common_runtime/gpu/] 2:   N N N Y 
2019-04-23 07:38:17.419271: I tensorflow/core/common_runtime/gpu/] 3:   N N Y N
Epoch 1/160
18/18 [==============================] - 28s 2s/step - loss: 0.6806 - acc: 0.7497
Epoch 2/160
18/18 [==============================] - 11s 588ms/step - loss: 0.5285 - acc: 0.7811
Epoch 3/160
18/18 [==============================] - 11s 590ms/step - loss: 0.4687 - acc: 0.7814
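
The constraint that the batch size be a multiple of the number of GPUs can be checked before a job is submitted. The helper below is a hypothetical sketch (its name and error messages are not part of the application); it assumes the batch is split evenly across GPUs and uses the limit of 4 GPUs per session stated above.

```python
def per_gpu_batch(batch_size, num_gpus, max_gpus=4):
    """Return the per-GPU batch size, or raise if the options are inconsistent."""
    if not 1 <= num_gpus <= max_gpus:
        raise ValueError(f"num_gpus must be between 1 and {max_gpus}")
    if batch_size % num_gpus:
        raise ValueError("batch size must be a multiple of the number of GPUs")
    # Assumption: the batch is divided evenly among the GPUs.
    return batch_size // num_gpus


print(per_gpu_batch(8, 4))   # the options "-g 4 -b 8" used above
```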
End the interactive session:
[user@cn4471 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Batch job
Most jobs should be run as batch jobs.

Create a batch input file. For example:

module load unet 
cp -r $UNET_DATA/* .
 -e 100 -l 0.0001

Submit this job using the Slurm sbatch command.

sbatch [--cpus-per-task=#] [--mem=#]