Extreme Mobility of Compute
Singularity containers let users run applications in a Linux environment of their choosing.
Possible uses for Singularity on Biowulf:
These definition files can all be found on GitHub, and the containers built from them are hosted on Singularity hub.
Additionally, a large number of staff-maintained definition files and associated helper scripts can be found at this GitHub repo. These are the files that staff members use to install containerized apps on the NIH HPC systems.
export SINGULARITY_CACHEDIR=/data/${USER}/.singularity
To use Singularity on Biowulf, you either need to create your own Singularity container, or use one created by someone else. You have several options to build Singularity containers:
You can find information about installing Singularity on Linux here.
In addition to your own Linux environment, you will also need a definition file to build a Singularity container from scratch. You can find some simple definition files for a variety of Linux distributions in the /example directory of the source code. You can also find a small list of definition files containing popular applications at the top of this page. Detailed documentation about building Singularity container images is available at the Singularity website.
Binding a directory to your Singularity container allows you to access files in a host system directory from within your container. By default, Singularity will bind your $HOME directory (along with a few other directories such as /tmp and /dev). You can also bind other directories into your Singularity container yourself. The process is described in detail in the Singularity documentation.
While $HOME is bind-mounted to the container by default, there are several filesystems on the NIH HPC systems that you may also want to include. Furthermore, if you are running a job and have allocated local scratch space, you might like to bind mount your /lscratch directory to /tmp in the container.
The following command opens a shell in a container while bind-mounting your data directory, /fdb, and /lscratch at the same paths inside the container. If you have access to shared data directories, you'll want to add them to the list as well (for example, /data/$USER,/data/mygroup1,/data/mygroup2,/fdb,...).
[user@cn1234 ~]$ singularity shell --bind /data/$USER,/fdb,/lscratch my-container.sif
or, using the environment variable:
[user@cn1234 ~]$ export SINGULARITY_BINDPATH="/data/$USER,/fdb,/lscratch"
[user@cn1234 ~]$ singularity shell my-container.sif
If you would like to store this in ~/.bashrc, you can also automatically bind /lscratch/$SLURM_JOB_ID to /tmp inside the container depending on whether a local scratch allocation is detected:
SINGULARITY_BINDPATH="/data/$USER,/fdb"
[ -d /lscratch ] && SINGULARITY_BINDPATH="${SINGULARITY_BINDPATH},/lscratch/${SLURM_JOB_ID}:/tmp"
export SINGULARITY_BINDPATH
When you share this container, your colleagues here and elsewhere can bind their own corresponding directories to these same mountpoints.
Finally, the NIH HPC staff maintains a file that will set the $SINGULARITY_BINDPATH environment variable appropriately for a wide variety of situations. It is considered a Biowulf best practice to source this file since it will be updated in the case of additions or deletions to the shared file system. You can source this file from the command prompt or from within a script like so:
[user@cn1234 ~]$ . /usr/local/current/singularity/app_conf/sing_binds
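If you need to bind shared directories beyond those handled above, the bind list can also be assembled programmatically. The following is a minimal sketch, not a staff-provided tool: the function name build_bindpath is hypothetical, and it only keeps entries whose host-side directory exists on the current node. It is not a replacement for sourcing sing_binds.

```shell
#!/bin/bash
# Hypothetical helper (not part of Singularity or the NIH HPC tools):
# print a comma-separated bind list containing only the entries whose
# host-side directory exists. Entries may be "src" or "src:dest".
build_bindpath() {
    local result="" spec src
    for spec in "$@"; do
        src="${spec%%:*}"            # host path is the text before any ':'
        [ -d "$src" ] || continue    # skip directories missing on this node
        result="${result:+${result},}${spec}"
    done
    printf '%s\n' "$result"
}

# Example usage inside a job with a local scratch allocation:
#   export SINGULARITY_BINDPATH="$(build_bindpath "/data/$USER" /fdb "/lscratch/${SLURM_JOB_ID}:/tmp")"
```

Because missing directories are silently dropped, it is worth printing the resulting value once to confirm it contains what you expect.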
One use case of Singularity is to transparently use software in a container as though it were directly installed on the host system. To accomplish this on our systems, you need to be aware of the shared filesystem locations and bind mount the corresponding directories inside the container, which is more complicated than it seems because we use symbolic links to refer to some of our network storage systems. As a result, you will need to specify some directories in addition to the ones you use directly to ensure that the symbolic link destinations are also bound into the container.
If you wanted to take advantage of a Debian package this way and use it to install software into your home directory, for example samtools and bcftools, you would use a definition file, Singularity, with these contents:
Bootstrap: docker
From: debian:9-slim
%post
    # install the desired software
    apt-get update
    apt-get install -y samtools bcftools
    apt-get clean
This defines a container based on the space-efficient "slim" Debian images from Docker Hub, installs the samtools and bcftools packages, and clears the apt package cache. The definition file itself does not set up our GPFS mounts; the bind mounts needed to use the container transparently are handled at runtime by the wrapper script described below.
After finalizing the definition file, you can proceed to build the container (of course, on a system where you have sudo or root access):
sudo singularity build hts.simg Singularity
You can then set up your installation prefix (here, it's $HOME/opt/hts) as follows, making use of symbolic links and a wrapper script:
$HOME/opt
└── hts
    ├── bin
    │   ├── samtools -> ../libexec/wrap
    │   └── bcftools -> ../libexec/wrap
    └── libexec
        ├── wrap
        └── hts.simg
where the wrapper script wrap looks like:
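#!/bin/bash
# set SINGULARITY_BINDPATH using the staff-maintained script
. /usr/local/current/singularity/app_conf/sing_binds
# directory containing this script (libexec) and the installation prefix above it
selfdir="$(dirname "$(readlink -f "${BASH_SOURCE[0]}")")"
instdir="$(dirname "${selfdir}")"
# name this script was invoked as (for example, samtools or bcftools)
cmd="$(basename "$0")"
singularity exec -B "${instdir}" "${selfdir}/hts.simg" "$cmd" "$@"
wrap checks to see how it was called, then passes that same command to the container after appropriately setting SINGULARITY_BINDPATH by calling the staff-maintained sing_binds script.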
So if you have added the installation prefix $HOME/opt/hts/bin to your PATH, then calling samtools or bcftools will run those programs from within your container. And because we have arranged to bind mount all the necessary filesystems into the container, the path names you provide for input and output into the programs will be available to the container in the same way.
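The installation prefix described above can be created with a handful of commands. As a rough sketch (the function name install_wrapped_tools is made up for illustration, and it assumes you already have the wrap script and the hts.simg image as local files), the layout could be set up like this:

```shell
#!/bin/bash
# Hypothetical helper: create <prefix>/bin and <prefix>/libexec, copy the
# wrapper script and container image into libexec, and add one symlink per
# tool in bin that points at the wrapper.
install_wrapped_tools() {
    local prefix="$1" wrapper="$2" image="$3" tool
    shift 3
    mkdir -p "${prefix}/bin" "${prefix}/libexec" || return 1
    cp "$wrapper" "${prefix}/libexec/wrap" || return 1
    cp "$image" "${prefix}/libexec/$(basename "$image")" || return 1
    chmod +x "${prefix}/libexec/wrap" || return 1
    for tool in "$@"; do
        # Relative link, so the whole prefix can be moved without breaking it
        ln -sfn ../libexec/wrap "${prefix}/bin/${tool}"
    done
}

# Example:
#   install_wrapped_tools "$HOME/opt/hts" ./wrap ./hts.simg samtools bcftools
#   export PATH="$HOME/opt/hts/bin:$PATH"
```

Note that the wrapper shown earlier refers to the image by the fixed name hts.simg, so the image file passed to this function needs to keep that name.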
Singularity cannot be run on the Biowulf login node.
To run a Singularity container image on Biowulf interactively, you need to allocate an interactive session and load the Singularity module. In this sample session, an Ubuntu 16.04 container is downloaded from Docker Hub and run. If you want to run a local Singularity container instead of downloading one, just replace the Docker Hub URL (docker://ubuntu) with the path to your container image file.
[user@biowulf ~]$ sinteractive --cpus-per-task=4 --mem=10g
salloc.exe: Pending job allocation 43131269
salloc.exe: job 43131269 queued and waiting for resources
salloc.exe: job 43131269 has been allocated resources
salloc.exe: Granted job allocation 43131269
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn0123 are ready for job
srun: error: x11: no local DISPLAY defined, skipping
[user@cn0123 ~]$ module load singularity
[+] Loading singularity 2.4 on cn3160
[user@cn0123 ~]$ singularity shell docker://ubuntu
Docker image path: index.docker.io/library/ubuntu:latest
Cache folder set to /spin1/home/linux/user/.singularity/docker
Creating container runtime...
WARNING: Bind file source does not exist on host: /etc/resolv.conf
Singularity: Invoking an interactive shell within container...
Singularity ubuntu:~> cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.3 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.3 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
Singularity ubuntu:~> exit
[user@cn0123 ~]$ exit
exit
salloc.exe: Relinquishing job allocation 23562157
[user@biowulf ~]$
Note that you need to exit your Singularity container as well as your allocated interactive Slurm session when you are done.
In this example, Singularity will be used to run a TensorFlow example in an Ubuntu 16.04 container.
First, create a container image on a machine where you have root privileges. These commands were run on a Google Cloud VM instance running an Ubuntu 16.04 image, and the Singularity container was created using this definition file that includes a TensorFlow installation.
[user@someCloud ~]$ sudo singularity build ubuntu_w_TFlow.simg ubuntu_w_TFlow.def
Next, copy the TensorFlow script that you want to run into your home directory, or another directory that will be visible from within the container at runtime (see 'binding external directories' above). In this case, this example script from the TensorFlow website was copied to /home/$USER, and the container was moved to the user's data directory:
[user@someCloud ~]$ scp TFlow_example.py user@biowulf.nih.gov:
[user@someCloud ~]$ scp ubuntu_w_TFlow.simg user@biowulf.nih.gov:/data/user
Then ssh to Biowulf and write a batch script to run the singularity command similar to this:
#!/bin/sh
# file called myjob.batch
set -e
module load singularity
cd /data/user
singularity exec ubuntu_w_TFlow.simg python ~/TFlow_example.py
Submit the job like so:
[user@biowulf ~]$ sbatch myjob.batch
After the job finishes executing you should see the following output in the slurm*.out file.
[+] Loading singularity 2.4 on cn2725
(0, array([-0.39398459], dtype=float32), array([ 0.78525567], dtype=float32))
(20, array([-0.05549375], dtype=float32), array([ 0.38339305], dtype=float32))
(40, array([ 0.05872268], dtype=float32), array([ 0.3221375], dtype=float32))
(60, array([ 0.08904253], dtype=float32), array([ 0.30587664], dtype=float32))
(80, array([ 0.09709124], dtype=float32), array([ 0.30156001], dtype=float32))
(100, array([ 0.09922785], dtype=float32), array([ 0.30041414], dtype=float32))
(120, array([ 0.09979502], dtype=float32), array([ 0.30010995], dtype=float32))
(140, array([ 0.09994559], dtype=float32), array([ 0.30002919], dtype=float32))
(160, array([ 0.09998555], dtype=float32), array([ 0.30000776], dtype=float32))
(180, array([ 0.09999616], dtype=float32), array([ 0.30000207], dtype=float32))
(200, array([ 0.09999899], dtype=float32), array([ 0.30000055], dtype=float32))
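Since a batch job runs unattended, a misplaced image or script only shows up after the job has started. As an optional sketch (require_files is a hypothetical helper and is not used by the batch script above), a job script could check that its inputs exist before calling singularity:

```shell
#!/bin/bash
# Hypothetical pre-flight check: report every missing path on stderr and
# return non-zero if at least one of the given paths does not exist.
require_files() {
    local f missing=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage near the top of a batch script:
#   require_files /data/user/ubuntu_w_TFlow.simg "$HOME/TFlow_example.py" || exit 1
```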
With the release of Singularity v2.3 it is no longer necessary to install NVIDIA drivers into your Singularity container to access the GPU on a host node. If you still want the deprecated gpu4singularity script that was used to install NVIDIA drivers within containers for use on our GPU nodes you can find it on GitHub.
Now, you can simply use the --nv option to grant your containers GPU support at runtime. Consider the following example, in which we download some TensorFlow models to the user's home directory and then run the latest TensorFlow container from Docker Hub to train a model on the MNIST handwritten digit data set using a GPU node.
[user@biowulf ~]$ git clone https://github.com/tensorflow/models.git
Initialized empty Git repository in /home/user/models/.git/
remote: Counting objects: 4971, done.
remote: Compressing objects: 100% (26/26), done.
remote: Total 4971 (delta 14), reused 11 (delta 2), pack-reused 4943
Receiving objects: 100% (4971/4971), 153.50 MiB | 12.21 MiB/s, done.
Resolving deltas: 100% (2540/2540), done.
[user@biowulf ~]$ sinteractive --constraint=gpuk80 --gres=gpu:k80:1
salloc.exe: Pending job allocation 39836528
salloc.exe: job 39836528 queued and waiting for resources
salloc.exe: job 39836528 has been allocated resources
salloc.exe: Granted job allocation 39836528
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn4178 are ready for job
srun: error: x11: no local DISPLAY defined, skipping
[user@cn4178 ~]$ module load singularity
[+] Loading singularity 2.4 on cn4178
[user@cn4178 ~]$ singularity exec --nv docker://tensorflow/tensorflow:latest-gpu \
    python ~/models/tutorials/image/mnist/convolutional.py
Docker image path: index.docker.io/tensorflow/tensorflow:latest-gpu
Cache folder set to /spin1/home/linux/user/.singularity/docker
[19/19] |===================================| 100.0%
Creating container runtime...
WARNING: Bind file source does not exist on host: /etc/resolv.conf
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
2017-06-14 18:54:40.157855: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-14 18:54:40.157887: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-14 18:54:40.157896: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-06-14 18:54:40.157903: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-14 18:54:40.157911: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-06-14 18:54:40.737822: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties:
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:84:00.0
Total memory: 11.92GiB
Free memory: 11.86GiB
2017-06-14 18:54:40.737858: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0
2017-06-14 18:54:40.737867: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0: Y
2017-06-14 18:54:40.737881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:84:00.0)
Initialized!
Step 0 (epoch 0.00), 17.1 ms
Minibatch loss: 8.334, learning rate: 0.010000
Minibatch error: 85.9%
Validation error: 84.6%
Step 100 (epoch 0.12), 13.4 ms
Minibatch loss: 3.254, learning rate: 0.010000
Minibatch error: 3.1%
Validation error: 7.8%
Step 200 (epoch 0.23), 11.6 ms
Minibatch loss: 3.354, learning rate: 0.010000
Minibatch error: 10.9%
Validation error: 4.5%
Step 300 (epoch 0.35), 11.5 ms
[...snip...]
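In scripts that may run on both GPU and non-GPU nodes, the --nv option can be added conditionally. The following is a minimal sketch under one stated assumption: that Slurm sets CUDA_VISIBLE_DEVICES to a non-empty value for jobs that were allocated a GPU (verify this on your system before relying on it). The helper name nv_flag is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print "--nv" when a GPU appears to be allocated.
# Assumption: CUDA_VISIBLE_DEVICES is set and non-empty inside GPU jobs.
nv_flag() {
    if [ -n "${CUDA_VISIBLE_DEVICES:-}" ]; then
        printf '%s\n' "--nv"
    fi
}

# Example usage (left unquoted on purpose, so an empty result adds no argument):
#   singularity exec $(nv_flag) docker://tensorflow/tensorflow:latest-gpu \
#       python ~/models/tutorials/image/mnist/convolutional.py
```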
Singularity can import, bootstrap, and even run Docker images directly from Docker Hub. For instance, the following commands will start an Ubuntu container running on a compute node with no need for a definition file or container image! And, of course, we remember to set SINGULARITY_BINDPATH appropriately to be able to access all our files.
[user@cn0123 ~]$ module load singularity
[+] Loading singularity on cn0123
[user@cn0123 ~]$ . /usr/local/current/singularity/app_conf/sing_binds
[user@cn0123 ~]$ singularity shell docker://ubuntu:latest
Docker image path: index.docker.io/library/ubuntu:latest
Cache folder set to /spin1/home/linux/user/.singularity/docker
[5/5] |===================================| 100.0%
Creating container runtime...
WARNING: Bind file source does not exist on host: /etc/resolv.conf
Singularity: Invoking an interactive shell within container...
Singularity.ubuntu:latest> cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.3 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.3 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
In this instance the container is ephemeral. It will disappear as soon as you exit the shell. If you wanted to actually download the container from Docker Hub, you could use the pull command like so:
[user@cn0123 ~]$ singularity pull docker://ubuntu:latest
WARNING: pull for Docker Hub is not guaranteed to produce the
WARNING: same image on repeated pull. Use Singularity Registry
WARNING: (shub://) to pull exactly equivalent images.
Docker image path: index.docker.io/library/ubuntu:latest
Cache folder set to /spin1/home/linux/user/.singularity/docker
Importing: base Singularity environment
Importing: /spin1/home/linux/user/.singularity/docker/sha256:ae79f251470513c2a0ec750117a81f2d58a50727901ca416efecf297b8a03913.tar.gz
Importing: /spin1/home/linux/user/.singularity/docker/sha256:c59d01a7e4caf1aba785eb33192fec3f96e4ab01975962bcec10f4989a6edcc6.tar.gz
Importing: /spin1/home/linux/user/.singularity/docker/sha256:41ba73a9054d231e1f555c40a74762276900cc6487f5c6cf13b89c7606635c67.tar.gz
Importing: /spin1/home/linux/user/.singularity/docker/sha256:f1bbfd495cc1112b503350686641ee4fa2cea8ccd13fb8a8a302c81dae61d418.tar.gz
Importing: /spin1/home/linux/user/.singularity/docker/sha256:0c346f7223e24b517358f52c4a3f5f9af1c86e5ddeaee5659c8889846e46d1e2.tar.gz
Importing: /spin1/home/linux/user/.singularity/metadata/sha256:f6be9f4f6905406c1e7fd6031ee3104d25ad6a31d10d5e9192e7abf7a21e519a.tar.gz
WARNING: Building container as an unprivileged user. If you run this container as root
WARNING: it may be missing some functionality.
Building Singularity image...
Singularity container built: ./ubuntu-latest.img
Cleaning up...
[user@cn0123 ~]$ . /usr/local/current/singularity/app_conf/sing_binds
[user@cn0123 ~]$ singularity shell ubuntu-latest.img
WARNING: Bind file source does not exist on host: /etc/resolv.conf
Singularity: Invoking an interactive shell within container...
Singularity ubuntu-latest.img:~> cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.3 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.3 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
This feature gives you instant access to 100,000+ pre-built container images. You can even use a Docker Hub container as a starting point in a definition file.
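Because a repeated pull from Docker Hub is not guaranteed to produce the same image (see the warning in the pull output above), it can help to record a checksum of the image file you actually used. This is a minimal sketch using the standard sha256sum utility; the function names record_checksum and verify_checksum are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers: record and later verify the SHA-256 checksum of an
# image file. The checksum is stored next to the image as <file>.sha256.
record_checksum() {
    # Writes a line of the form "<hash>  <file>" to <file>.sha256
    sha256sum "$1" > "$1.sha256"
}

verify_checksum() {
    # Succeeds only if the file still matches the recorded hash. The path is
    # stored as given, so use the same (ideally absolute) path both times.
    sha256sum -c "$1.sha256" > /dev/null 2>&1
}

# Example usage:
#   record_checksum "$PWD/ubuntu-latest.img"
#   verify_checksum "$PWD/ubuntu-latest.img" && echo "image unchanged"
```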
In this example, we will create a Singularity container image starting from the official continuumio miniconda container on Docker Hub. Then we'll install a number of RNASeq tools. This would allow us to write a pipeline with, for example, Snakemake and distribute it along with the image to create an easily shared, reproducible workflow. This definition file also installs a runscript enabling us to treat our container like an executable.
BootStrap: docker
From: continuumio/miniconda:latest
IncludeCmd: yes

%post
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    # this will install all necessary packages and prepare the container
    apt-get -y update
    apt-get -y install make gcc zlib1g-dev libncurses5-dev
    wget https://github.com/samtools/samtools/releases/download/1.3.1/samtools-1.3.1.tar.bz2 \
        && tar -xjf samtools-1.3.1.tar.bz2 \
        && cd samtools-1.3.1 \
        && make \
        && make prefix=/usr/local install
    export PATH=/opt/conda/bin:$PATH
    conda install --yes -c bioconda \
        star=2.5.2b \
        sailfish=0.10.1 \
        fastqc=0.11.5 \
        kallisto=0.43.0 \
        subread=1.5.0.post3
    conda clean --index-cache --tarballs --packages --yes
    mkdir /data /resources

%runscript
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    # this text code will run whenever the container
    # is called as an executable or with `singularity run`
    function usage() {
        cat <<EOF
NAME
    rnaseq - rnaseq pipeline tools 0.1
SYNOPSIS
    rnaseq tool [tool options]
    rnaseq list
    rnaseq help
DESCRIPTION
    Singularity container with tools to build rnaseq pipeline.
EOF
    }

    function tools() {
        echo "conda: $(which conda)"
        echo "---------------------------------------------------------------"
        conda list
        echo "---------------------------------------------------------------"
        echo "samtools: $(samtools --version | head -n1)"
    }

    arg="${1:-none}"

    case "$arg" in
        none) usage; exit 1;;
        help) usage; exit 0;;
        list) tools; exit 0;;
        # just try to execute it then
        *)    $@;;
    esac

%environment
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    # This sets global environment variables for anything run within the container
    export PATH="/opt/conda/bin:/usr/local/bin:/usr/bin:/bin:"
    unset CONDA_DEFAULT_ENV
    export ANACONDA_HOME=/opt/conda
Assuming this file is called rnaseq.def, we can create a Singularity container called rnaseq on our build system with the following command:
[user@some_build_system ~]$ sudo singularity build rnaseq rnaseq.def
This image contains miniconda and our rnaseq tools and can be called directly as an executable like so:
[user@some_build_system ~]$ ./rnaseq help
NAME
    rnaseq - rnaseq pipeline tools 0.1
SYNOPSIS
    rnaseq snakemake [snakemake options]
    rnaseq list
    rnaseq help
DESCRIPTION
    Singularity container with tools to build rnaseq pipeline.
[user@some_build_system ~]$ ./rnaseq list
conda: /opt/conda/bin/conda
---------------------------------------------------------------
# packages in environment at /opt/conda:
#
fastqc      0.11.5    1              bioconda
java-jdk    8.0.92    1              bioconda
kallisto    0.43.0    1              bioconda
sailfish    0.10.1    boost1.60_1    bioconda
[...snip...]
[user@some_build_system ~]$ ./rnaseq samtools --version
samtools 1.3.1
Using htslib 1.3.1
Copyright (C) 2016 Genome Research Ltd.
After copying the image to the NIH HPC systems, allocate an sinteractive session and test it there:
[user@cn1234 ~]$ module load singularity
[user@cn1234 ~]$ ./rnaseq list
conda: /opt/conda/bin/conda
---------------------------------------------------------------
# packages in environment at /opt/conda:
#
fastqc      0.11.5    1              bioconda
java-jdk    8.0.92    1              bioconda
kallisto    0.43.0    1              bioconda
sailfish    0.10.1    boost1.60_1    bioconda
[...snip...]
This could be used with a Snakemake file like this:
rule fastqc:
    input: "{sample}.fq.gz"
    output: "{sample}.fastqc.html"
    shell:
        """
        module load singularity
        ./rnaseq fastqc ... {input}
        """

rule align:
    input: "{sample}.fq.gz"
    output: "{sample}.bam"
    shell:
        """
        module load singularity
        ./rnaseq STAR ....
        """
A few containers have caused issues on Biowulf by triggering a kernel-level bug described in detail here and here. These include fmriprep and nanodisco. The problems follow a predictable pattern: