Install R on an HPC cluster using bioconda

Maintain a consistent R environment across HPC environments with conda.

Install conda

We recommend installing conda with either Anaconda or Miniconda.

Once conda is installed, upgrade the conda base installation.

conda update --name="base" --channel="defaults" conda
conda update --name="base" --channel="defaults" --all

bash configuration

Up to conda v4.3, the bin directory should be added to $PATH in ~/.bash_profile:

export PATH="$CONDA_DIR/bin:$PATH"

As of v4.4, conda is loaded differently: a profile script must be sourced in ~/.bash_profile instead:

source "$CONDA_DIR/etc/profile.d/conda.sh"
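
When one ~/.bash_profile is shared across clusters running different conda versions, the two approaches above can be combined. The following is a minimal sketch, assuming CONDA_DIR points to the root of the conda installation; the function name load_conda is hypothetical and not part of conda.

```shell
# Hypothetical helper: load conda from the directory given as $1
# (falls back to $CONDA_DIR when no argument is passed).
load_conda() {
    _conda_dir="${1:-${CONDA_DIR:-}}"
    if [ -z "$_conda_dir" ]; then
        echo "CONDA_DIR is not set" >&2
        return 1
    fi
    if [ -f "$_conda_dir/etc/profile.d/conda.sh" ]; then
        # conda v4.4 and later: source the profile script.
        . "$_conda_dir/etc/profile.d/conda.sh"
    elif [ -d "$_conda_dir/bin" ]; then
        # conda v4.3 and earlier: prepend the bin directory to PATH.
        PATH="$_conda_dir/bin:$PATH"
        export PATH
    else
        echo "No conda installation found in $_conda_dir" >&2
        return 1
    fi
}
```

Calling load_conda with no argument from ~/.bash_profile uses the value of CONDA_DIR; a non-zero return status means neither layout was found.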

Set up channels

Add the channels in the following order. Each --add call prepends to the channel list, so the channel added last (conda-forge) gets the highest priority:

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

This creates a ~/.condarc file that contains the following:

channels:
  - conda-forge
  - bioconda
  - defaults
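
The resulting priority can be confirmed by reading the channel list back from the file. This is a minimal sketch, assuming the channels block in ~/.condarc is a simple YAML list of "- name" entries; condarc_channels is a hypothetical helper, not a conda command.

```shell
# Hypothetical helper: print the channels listed in a condarc file
# (default: ~/.condarc), one per line, highest priority first.
condarc_channels() {
    awk '
        # Start of the channels list.
        /^channels:/ { in_block = 1; next }
        # A "- name" entry inside the list: strip the dash and print the name.
        in_block && /^[ \t]*-[ \t]*/ {
            sub(/^[ \t]*-[ \t]*/, "")
            print
            next
        }
        # Any other top-level key ends the list.
        /^[^ \t#]/ { in_block = 0 }
    ' "${1:-$HOME/.condarc}"
}
```

Run without arguments, the function reads ~/.condarc; the output should list conda-forge, bioconda, and defaults in that order.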

Managing environments

A list of installed conda environments can be obtained with:

conda env list

A conda environment can be deactivated with:

conda deactivate

Here’s how to remove an environment:

conda env remove --name="R-3.5.1-YYYYMMDD"
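
Before removing an environment, it can help to check that the name actually exists. This is a minimal sketch, assuming the output of conda env list puts the environment name in the first column and marks comment lines with "#"; env_in_list is a hypothetical helper, not a conda command.

```shell
# Hypothetical helper: succeed if the environment named $1 appears in
# `conda env list` output supplied on stdin.
env_in_list() {
    awk -v name="$1" '
        # Skip comment lines; match the name in the first column.
        !/^#/ && $1 == name { found = 1 }
        END { exit !found }
    '
}

# Example usage (not run here):
#   conda env list | env_in_list "R-3.5.1-YYYYMMDD" \
#       && conda env remove --name="R-3.5.1-YYYYMMDD"
```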

Create R 3.5.1 environment

Note that pandoc version 2 currently has issues rendering R Markdown templates, so pandoc is pinned to version 1 below.

conda create --name="R-3.5.1-YYYYMMDD" \
    blas \
    emacs \
    gcc \
    hdf5=1.10.1 \
    java-jdk \
    libgfortran \
    libiconv \
    mysql \
    openblas \
    pandoc=1 \
    r-base=3.5.1 \
    tmux \
    umap-learn

Create a matching R library directory:

mkdir -p ~/R/x86_64-pc-linux-gnu-library/3.5/bioc-3.8-release-YYYYMMDD
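
The YYYYMMDD suffix appears to be a placeholder for the creation date. The following is a minimal sketch for deriving the environment name and the library path from one date stamp, assuming that reading of the placeholder; the variable names are hypothetical.

```shell
# Assumption: YYYYMMDD stands for the date the environment is created.
stamp="$(date +%Y%m%d)"

# Names follow the patterns used in this guide.
env_name="R-3.5.1-${stamp}"
lib_dir="${HOME}/R/x86_64-pc-linux-gnu-library/3.5/bioc-3.8-release-${stamp}"

# Print the names only; nothing is created here.
echo "conda environment: ${env_name}"
echo "R library:         ${lib_dir}"
```

The two variables can then be substituted into the conda create and mkdir commands so that the environment and its library always share the same stamp.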

Create ~/.Renviron and ~/.Rprofile files, using the recommended defaults from seqcloud.

Now let’s activate the conda environment and run R.

conda activate R-3.5.1-YYYYMMDD
R