Conda is a cross-platform and language-independent user-level package manager. It is well-established in the bioinformatics community by virtue of the bioconda package repository.
See the personal software installation user guide for more about package managers and alternative approaches.
The mamba_install wrapper script below installs miniforge following our recommendations.
/tmp
.
If you do want to use helix, you can do so by setting TMPDIR to a location in your data directory.
If you encounter an error such as
RuntimeError: Multi-download failed. Reason: Transfer finalized, status: 403 [https://repo.anaconda.com/pkgs/r/noarch/repodata.json] 4020 bytes
then you can fix it in most cases by disabling the default channels and adding alternative channels. This can be done by adding the following lines to a .condarc file at the top level of the miniconda install (i.e. $CONDA_ROOT/.condarc). You might also want to set strict channel priority at the same time if you haven't already done so:
channels:
  - conda-forge
  - bioconda
defaults: []
channel_priority: strict
You should also ensure that your ~/.condarc does not explicitly set channels or defaults. Alternatively, follow the instructions below to obtain an access token to take advantage of the NIH Anaconda Professional license.
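A minimal sketch of writing those channel settings from the shell. It assumes you pass the top level of your conda install as the argument; the function name write_condarc and the example path are illustrative only. It refuses to overwrite an existing file so that settings you already have are not lost.

```shell
# Sketch: write the channel settings described above to <conda root>/.condarc.
# write_condarc is a hypothetical helper name, not part of conda or mamba_install.
write_condarc() {
    rc="$1/.condarc"
    if [ -e "$rc" ]; then
        echo "refusing to overwrite existing $rc" >&2
        return 1
    fi
    cat > "$rc" <<'EOF'
channels:
  - conda-forge
  - bioconda
defaults: []
channel_priority: strict
EOF
    echo "wrote $rc"
}

# Example call (hypothetical install location):
#   write_condarc "/data/$USER/conda"
```

If the file already exists, edit it by hand instead and compare it against the lines above.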
If a conda environment containing the dbus package is added to your PATH in one of your startup files, NoMachine may fail with a black screen. This can happen when you automatically activate your own conda installation therein. The solution is to remove any unnecessary initialization from your shell startup file: manually comment out or remove the lines added by conda init, run conda init --reverse, or remove the dbus package from the environment.
If a conda environment containing the dbus package is added to your PATH in one of your startup files, TurboVNC may fail to authenticate. This can happen when you automatically activate your own conda installation therein. The solution is the same: manually comment out or remove the lines added by conda init, run conda init --reverse, or remove the dbus package from the environment.
The mamba_install wrapper is a module to install and configure miniforge for you. Once that is installed you can easily initialize your shell to use the install and set up your own environments.
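By default, mamba_install removes any "conda init" code from your shell dotfiles (conda init --reverse).
It also creates an init_file with the code necessary to activate the conda install for your default shell. To activate the install, source the init_file in your shell session. If the init_file already exists, this step is skipped. Some examples: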
Install fresh conda env in /data/$USER/conda and generate conda init_file as ~/bin/myconda
    mamba_install
Install fresh conda env in /data/$USER/mymamba and generate conda init_file as ~/bin/mymamba
    mamba_install --init-file=~/bin/mymamba /data/$USER/mymamba
Remove the "conda init" code from your shell startup file when the conda env is at /data/$USER/conda
    mamba_install --cleanup-only
Add "conda init" to ~/bin/myminiconda for a customized env
    mamba_install --init-only --shell=bash --init-file=~/bin/myminiconda /data/$USER/miniconda/
To load the mamba_install module in an interactive session:
[user@biowulf]$ sinteractive --mem=20g --gres=lscratch:20
[user@cn3444]$ module load mamba_install
[user@cn3444]$ mamba_install -h
NAME
    mamba_install - Install and configure mamba + conda forge

SYNOPSIS
    mamba_install [OPTIONS] [mamba_directory]

DESCRIPTION
    Installs conda/mamba following best practices for NIH HPC as described at
    https://hpc.nih.gov/docs/diy_installation/conda.html

    The default install location is /data/apptest2/conda and installs in the
    home directory are not allowed. Fails if the install directory exists already.

    By default creates an init_file with the code necessary to activate the conda
    install for your default shell. To activate source the init_file in your shell
    session. If the init_file already exists this step is skipped. If the init_file
    is created in a directory that is included in the path then the source command
    does not need to specify the whole path (e.g. 'source myconda' will work for the
    default location of the init file). Use this init file instead of allowing
    conda/mamba to modify your .bashrc/.zshrc/... to avoid automatic activation of a
    conda install which can result various hard to diagnose problems.

    By default this will remove any conda init code from all dotfiles. Use
    --no-cleanup if you don't want that.

    --no-init             Do not create init_file
    --init-only           Create init file for an existing mamba forge install and exit
    --no-cleanup          Do not remove "conda init" code from all shell dotfiles
    --cleanup-only        Remove "conda init" code from all shell dotfiles and exit
    --init-file=FILENAME  Name of the init file. Defaults to '~/bin/myconda'
    --shell=SHELL         Create init for SHELL instead of the your default shell.
                          Allowed: bash, fish, tcsh, zsh, xonsh
    --debug               Preserve temp dir

EXAMPLES
    Install in /data/apptest2/conda
        mamba_install
    Install in /data/apptest2/mymamba
        mamba_install /data/apptest2/mymamba
    Install in /data/apptest2/conda and put the init file in a different path
        mamba_install --init-file=~/bin/mymamba
    Remove the "conda init" code from all shell init files
        mamba_install --cleanup-only
    Generate init file for customized conda env
        mamba_install --init-only --shell=bash --init-file=~/myminiconda /data/apptest2/miniconda/

MAINTAINERS
    Qi Yu and Wolfgang Resch
    For questions please email staff@hpc.nih.gov
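Before running a cleanup it can be useful to see which startup files contain "conda init" code at all. The following is a rough sketch, not part of mamba_install: it assumes the block written by conda init begins with the marker line "# >>> conda initialize >>>", and the function names are made up for illustration.

```shell
# Sketch: report which startup files contain a "conda init" block.
# Assumption: the block starts with the marker line "# >>> conda initialize >>>".
has_conda_init() {
    grep -q '^# >>> conda initialize >>>' "$1" 2>/dev/null
}

report_conda_init() {
    for f in "$@"; do
        if has_conda_init "$f"; then
            echo "conda init code found in $f"
        fi
    done
}

# Example: check a few common bash and zsh startup files (missing files are ignored).
report_conda_init "$HOME/.bashrc" "$HOME/.bash_profile" "$HOME/.zshrc"
```

Files that are reported here are the ones that mamba_install --cleanup-only would be expected to touch.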
In the following example, we will cover some basics of using conda to create private environments.
Option 1: Load the mamba_install module and run the wrapper script to install a fresh mamba in your data directory (the default) or in an alternative directory of your choice.
[user@biowulf]$ sinteractive --mem=20g --gres=lscratch:20
[user@cn3444]$ module load mamba_install
[user@cn3444]$ mamba_install
...
[user@cn3444]$ source myconda
[user@cn3444]$ mamba --help
Option 2: Download the miniforge installer yourself and install it to a path in a data or shared directory such as /data/$USER/conda. Start with an interactive session and point TMPDIR at local scratch:
[user@biowulf]$ sinteractive --mem=20g --gres=lscratch:20
...
[user@cn3444]$ cd /data/$USER
[user@cn3444]$ export TMPDIR=/lscratch/$SLURM_JOB_ID
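The TMPDIR choice can be wrapped in a small helper so the same line works inside and outside a job with local scratch. This is only a sketch: pick_tmpdir is a hypothetical function name, and the fallback directory /data/$USER/tmp is an assumption, not a site recommendation.

```shell
# Sketch: pick a temp directory for conda/mamba to use.
# Inside a job with local scratch, use /lscratch/$SLURM_JOB_ID;
# otherwise fall back to the directory given as $1 (default: /data/$USER/tmp, an assumed name).
pick_tmpdir() {
    scratch="/lscratch/${SLURM_JOB_ID:-}"
    if [ -n "${SLURM_JOB_ID:-}" ] && [ -d "$scratch" ]; then
        echo "$scratch"
    else
        echo "${1:-/data/${USER:-}/tmp}"
    fi
}

# Usage (hypothetical):
#   export TMPDIR="$(pick_tmpdir)"
#   mkdir -p "$TMPDIR"
```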
Download the miniforge installer and install it to a path in a data or shared directory like /data/$USER/conda.
[user@cn3444]$ wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
--2022-04-01 11:13:41-- https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
[...snip...]
Length: 92971376 (89M) [application/octet-stream]
Saving to: ‘Miniforge3-Linux-x86_64.sh’
100%[===================================================>] 92,971,376 111MB/s in 0.8s
2022-04-01 11:13:42 (111 MB/s) - ‘Miniforge3-Linux-x86_64.sh’ saved [92971376/92971376]
[user@cn3444]$ bash Miniforge3-Linux-x86_64.sh -p /data/$USER/conda -b
PREFIX=/data/$USER/conda
Unpacking payload ...
Extracting "python-3.9.10-h85951f9_2_cpython.tar.bz2"
Extracting "_libgcc_mutex-0.1-conda_forge.tar.bz2"
[...snip...]
installation finished.
[user@cn3444]$ rm Miniforge3-Linux-x86_64.sh
To use the newly installed conda you will have to source an init file. Do this each time you are going to work with your environment.
Do not allow conda/mamba to add automatic initialization to your startup files (e.g. .bashrc), as environments can interfere with login or NoMachine.
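One way to keep activation explicit is a small function that sources the init file only when you call it. The sketch below assumes the init file lives at ~/bin/myconda (the mamba_install default); the function name use_conda is made up for illustration.

```shell
# Sketch: activate a personal conda install only on request.
# The default init file path ~/bin/myconda is an assumption; pass another path as $1 if needed.
use_conda() {
    init_file="${1:-$HOME/bin/myconda}"
    if [ -r "$init_file" ]; then
        . "$init_file"
    else
        echo "init file not found: $init_file" >&2
        return 1
    fi
}

# Usage, in a session where conda is actually needed:
#   use_conda
#   mamba activate project1
```

Defining the function does not activate anything by itself, so login and remote desktop sessions start with a clean environment.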
After sourcing the conda init file, activate the base environment and update the conda package manager which itself is just a package:
[user@cn3444]$ source /data/$USER/conda/etc/profile.d/conda.sh && source /data/$USER/conda/etc/profile.d/mamba.sh
### to make things easier you can create a file called `myconda` in a directory
### on your path such as ~/bin. This could be done like so (assuming the same
### paths as we used here).
[user@cn3444]$ mkdir -p ~/bin
### this whole multi-line "heredoc" creates an activation script
### (the conda path is double-quoted so that $USER expands when the script is sourced)
[user@cn3444]$ cat <<'__EOF__' > ~/bin/myconda
__conda_setup="$("/data/$USER/conda/bin/conda" 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/data/$USER/conda/etc/profile.d/conda.sh" ]; then
        . "/data/$USER/conda/etc/profile.d/conda.sh"
    else
        export PATH="/data/$USER/conda/bin:$PATH"
    fi
fi
unset __conda_setup
if [ -f "/data/$USER/conda/etc/profile.d/mamba.sh" ]; then
    . "/data/$USER/conda/etc/profile.d/mamba.sh"
fi
__EOF__
### then from *anywhere* the miniforge install can be activated with
[user@cn3444]$ source myconda
Let's not show the large mamba banner all the time:
[user@cn3444]$ export MAMBA_NO_BANNER=1
[user@cn3444]$ mamba activate base
(base) [user@cn3444]$ which python
/data/$USER/conda/bin/python
(base) [user@cn3444]$ mamba update --all
Looking for: ['_libgcc_mutex', 'ca-certificates', 'ld_impl_linux-64', 'libstdcxx-ng', 'libgomp', '_openmp_mutex', 'libgcc-ng', 'yaml-cpp', 'yaml', 'xz', 'reproc', 'openssl', 'ncurses', 'lzo', 'lz4-c', 'libzlib', 'libuuid', 'libnsl', 'libiconv', 'libffi', 'libev', 'keyutils', 'icu', 'c-ares', 'bzip2', 'reproc-cpp', 'libedit', 'readline', 'zstd', 'zlib', 'tk', 'krb5', 'sqlite', 'libxml2', 'libssh2', 'libsolv', 'libnghttp2', 'libarchive', 'libcurl', 'libmamba', 'pybind11-abi', 'tzdata', 'python', 'python_abi', 'setuptools', 'wheel', 'pip', 'six', 'pycparser', 'idna', 'colorama', 'charset-normalizer', 'tqdm', 'ruamel_yaml', 'pysocks', 'pycosat', 'libmambapy', 'certifi', 'cffi', 'conda-package-handling', 'cryptography', 'brotlipy', 'pyopenssl', 'urllib3', 'requests', 'conda', 'mamba']
conda-forge/noarch 7.8MB @ 3.8MB/s 2.2s
conda-forge/linux-64 21.9MB @ 3.6MB/s 6.6s
Pinned packages:
  - python 3.9.*
[...snip...]
  Change: 4 packages
  Upgrade: 5 packages
  Total download: 47MB
───────────────────────────────────────────────────────────────────────────────
Confirm changes: [Y/n] Y
[...snip...]
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
(base) [user@cn3444]$ mamba clean --all --yes
Cache location: /data/$USER/conda/pkgs
Will remove the following tarballs:
/data/$USER/conda/pkgs
------------------------
python-3.9.5-h12debd9_4.tar.bz2 22.6 MB
[...snip...]
idna-3.2-pyhd3eb1b0_0.conda 48 KB
---------------------------------------------------
Total: 59.4 MB
Removed python-3.9.5-h12debd9_4.tar.bz2
[...snip...]
Now let's create a new environment called project1 with an older version of pysam from the bioconda channel and python 3.7. For this we use mamba.
(base) [user@cn3444]$ mamba deactivate
[user@cn3444]$ mamba create -n project1 python=3.7 numpy scipy bioconda::pysam==0.15.3 samtools==1.9
Looking for: ['python=3.7', 'numpy', 'scipy', 'bioconda::pysam==0.15.3', 'samtools==1.9']
bioconda/linux-64 4.1MB @ 3.8MB/s 1.2s
bioconda/noarch 3.5MB @ 2.8MB/s 1.3s
conda-forge/noarch 7.8MB @ 3.9MB/s 2.2s
conda-forge/linux-64 21.9MB @ 3.9MB/s 6.2s
Transaction
  Prefix: /data/$USER/conda/envs/project1
  Updating specs:
   - python=3.7
   - numpy
   - scipy
   - bioconda::pysam==0.15.3
   - samtools==1.9
[...snip...]
Confirm changes: [Y/n] Y
[...snip...]
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ mamba activate project1
#
# To deactivate an active environment, use
#
#     $ mamba deactivate
[user@cn3444]$ mamba activate project1
(project1) [user@cn3444]$ which python
/data/$USER/conda/envs/project1/bin/python
(project1) [user@cn3444]$ samtools --version
samtools 1.9
Using htslib 1.9
Copyright (C) 2018 Genome Research Ltd.
(project1) [user@cn3444]$ mamba deactivate
[user@cn3444]$
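The same package list can also be kept in a file so the environment can be recreated later. A minimal sketch, assuming a file named project1.yml and a conda-forge-then-bioconda channel order; both are illustrative choices and are not taken from the session above, and write_project1_spec is a made-up helper name.

```shell
# Sketch: record the project1 package list in an environment file.
# The file name and the channel order are assumptions for illustration.
write_project1_spec() {
    cat > "$1" <<'EOF'
name: project1
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.7
  - numpy
  - scipy
  - bioconda::pysam==0.15.3
  - samtools==1.9
EOF
}

# Usage (requires an activated conda/mamba install):
#   write_project1_spec project1.yml
#   mamba env create -f project1.yml
```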
Now an environment for a different project with current pysam, some other tools, and numpy using the MKL numerical libraries. This time we add the bioconda channel to the channels for the environment so we don't have to use the bioconda:: prefix. A common pattern for environments used for bioinformatic software is to set up the bioconda and conda-forge channels on a per-environment basis. This allows conda-forge packages to override packages from the defaults channel. We also specify MKL for the numerical libraries and pin that so it won't change accidentally. Note that at this point mamba does not yet include the config command, so conda config is used instead.
[user@cn3444 temp]$ mamba create -n project2 python=3.8
[...snip...]
[user@cn3444]$ mamba activate project2
(project2) [user@cn3444]$ conda config --env --add channels bioconda
(project2) [user@cn3444]$ conda config --env --add channels conda-forge
Warning: 'conda-forge' already in 'channels' list, moving to the top
(project2) [user@cn3444]$ conda config --env --set channel_priority strict
(project2) [user@cn3444]$ conda config --env --add pinned_packages blas=*=mkl
(project2) [user@cn3444]$ conda config --show-sources
==> /data/$USER/conda/.condarc <==
channels:
  - conda-forge

==> /data/$USER/envs/project2/.condarc <==
pinned_packages:
  - libblas=*=*_mkl
  - python=3.8
channel_priority: strict
channels:
  - conda-forge
  - bioconda
(project2) [user@cn3444]$ mamba install -q pysam bedtools hisat2 blas numpy scipy
[...snip...]
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
(project2) [user@cn3444]$ which python
/data/$USER/conda/envs/project2/bin/python
(project2) [user@cn3444]$ mamba install tensorflow=*=cuda*
[...snip...]
Note that pip can be used to install packages into conda environments as well. However, this can sometimes cause problems when pip overwrites existing conda-installed packages.
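A related check before running pip is to confirm that the pip on your PATH belongs to the active environment, so packages do not land somewhere else. The sketch below relies on CONDA_PREFIX, which conda/mamba set when an environment is activated; pip_in_active_env is a hypothetical helper name. It does not prevent pip from overwriting conda-installed packages.

```shell
# Sketch: succeed only if `pip` resolves to a path inside the active conda environment.
# CONDA_PREFIX is set by conda/mamba on activation; the function name is illustrative.
pip_in_active_env() {
    pip_path="$(command -v pip 2>/dev/null)" || return 1
    [ -n "${CONDA_PREFIX:-}" ] || return 1
    case "$pip_path" in
        "$CONDA_PREFIX"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   pip_in_active_env || echo "pip is not from the active environment" >&2
```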
List environments in the current conda install, then deactivate environments
(project2) [user@cn3444]$ mamba info --env
# conda environments:
#
base                     /data/$USER/conda
project1                 /data/$USER/conda/envs/project1
project2              *  /data/$USER/conda/envs/project2
(project2) [user@cn3444]$ mamba deactivate
[user@cn3444]$
Re-install base conda. In rare cases a "start from fresh" solution is needed: move the existing install aside, reinstall, and then move your environments back:
[user@cn3444]$ mv /data/$USER/conda /data/$USER/conda_backup
[user@cn3444]$ ml mamba_install
[user@cn3444]$ mamba_install
.....
[user@cn3444]$ mv /data/$USER/conda_backup/envs /data/$USER/conda
[user@cn3444]$ source myconda
[user@cn3444]$ conda info --envs
All NIH staff are eligible to receive a no-cost, business-tier Anaconda license within the NIH Anaconda organization. You can use this form to request a license from the NIH Center for Information Technology.
The benefits of getting a license through NIH include:
How do you get set up? After you submit the account request form, look for an email with instructions for configuring your command-line and desktop conda installs to use the license. Below is an example of how to configure a fresh mamba-forge install to use the license:
[user@cn3444]$ ## activate your install if it isn't already
[user@cn3444]$ source myconda
[user@cn3444]$ conda install --yes conda-token -c Anaconda -n base
[user@cn3444]$ conda token set --system "REDACTED_TOKEN"
[user@cn3444]$ ## set up channel order according to bioconda recommendations
[user@cn3444]$ conda config --system --add channels defaults
[user@cn3444]$ conda config --system --add channels bioconda
[user@cn3444]$ conda config --system --add channels conda-forge
[user@cn3444]$ ## create an environment with just the defaults channel without changing the system config file
[user@cn3444]$ mamba create -n testdefault
[user@cn3444]$ mamba activate testdefault
[user@cn3444](testdefault)$ conda config --env --remove channels bioconda
[user@cn3444](testdefault)$ conda config --env --remove channels conda-forge
[user@cn3444](testdefault)$ conda config --show-sources
==> /data/user/conda/.condarc <==
add_anaconda_token: True
channel_priority: strict
channels:
  - conda-forge
  - bioconda
  - defaults
default_channels:
  - https://repo.anaconda.cloud/repo/main
  - https://repo.anaconda.cloud/repo/r
  - https://repo.anaconda.cloud/repo/msys2
restore_free_channel: False

==> /data/user/conda/envs/testdefault/.condarc <==
add_anaconda_token: True
channel_priority: strict
channels:
  - defaults
default_channels:
  - https://repo.anaconda.cloud/repo/main
  - https://repo.anaconda.cloud/repo/r
  - https://repo.anaconda.cloud/repo/msys2
restore_free_channel: False
[user@cn3444]$ ## this will now install only packages from the defaults channels
[user@cn3444](testdefault)$ mamba install python=3.11 blas=*=mkl numpy pandas
To learn more about Anaconda at NIH, visit this offering overview or contact Anaconda@nih.gov.