R – Packages with Singularity

This approach is useful when R packages have many dependencies and/or require additional software to be installed on the cluster (e.g. geojsonio, protobuf). We solve this by creating a Singularity image that takes care of the dependencies, software installs, and environment variables.

Copy an existing Singularity image

We provide the Singularity image sing_biocond_3.14.sif, which was built from the Singularity definition file bioconductor_3.14.def and includes many of the geospatial binaries. You can copy the image to your $HOME directory:

cp /n/singularity_images/FAS/R/sing_biocond_3.14.sif $HOME
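
If you script this step, a small guard can avoid re-copying the image when it is already in place (the build log at the end of this page reports an image of roughly 2.4 GB). The following is a minimal sketch; the function name copy_image_once is illustrative and not part of any FASRC tooling.

```shell
# Copy a Singularity image only if the destination does not have it yet.
# copy_image_once is a hypothetical helper name used for illustration.
copy_image_once() {
    src=$1
    dest_dir=$2
    dest="$dest_dir/$(basename "$src")"

    if [ ! -r "$src" ]; then
        echo "cannot read source image: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "already present: $dest"
        return 0
    fi
    cp "$src" "$dest" && echo "copied to: $dest"
}

# Usage on the cluster (path taken from the cp command above):
# copy_image_once /n/singularity_images/FAS/R/sing_biocond_3.14.sif "$HOME"
```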

Installing R packages

Once you have the Singularity image sing_biocond_3.14.sif, install the R packages you need from inside the container:

# start a shell inside the Singularity container
$ singularity shell sing_biocond_3.14.sif

# set your library
# (note: use a different path from the one that holds R packages installed directly on the cluster)
Singularity> mkdir -p $HOME/apps/R_Singularity/4.1.3
Singularity> export R_LIBS_USER=$HOME/apps/R_Singularity/4.1.3:$R_LIBS_USER

# start R shell
Singularity> R

# install packages inside R shell
> install.packages("sf")
Installing package into ‘/n/home01/jharvard/apps/R_Singularity/4.1.3’
(as ‘lib’ is unspecified)

... omitted output ...

* installing *binary* package ‘sf’ ...
* DONE (sf)

The downloaded source packages are in
‘/tmp/RtmpReEDf2/downloaded_packages’

# install INLA for R 4.1.3
> remotes::install_version("INLA", version="22.05.03",repos=c(getOption("repos"),INLA="https://inla.r-inla-download.org/R/testing"), dep=TRUE)

# check INLA works
> library(INLA)
Loading required package: Matrix
Loading required package: foreach
Loading required package: parallel
Loading required package: sp
This is INLA_22.05.03 built 2022-05-03 07:58:22 UTC.
- See www.r-inla.org/contact-us for how to get help.
- To enable PARDISO sparse library; see inla.pardiso()
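
The export of R_LIBS_USER above only lasts for the current shell, so it has to be repeated in each new session. A minimal sketch of a helper that could be kept in a file and sourced when needed is shown here; the function name use_container_r_lib is an assumption, and the directory layout simply mirrors the mkdir command above. Singularity normally passes the host environment into the container, so calling the helper on the host before starting the container should also work, but it is worth confirming with .libPaths() inside R.

```shell
# Set up a per-R-version package library that is kept separate from any
# library used by R installed directly on the cluster.
# use_container_r_lib is a hypothetical helper name.
use_container_r_lib() {
    r_version=$1
    lib_dir="$HOME/apps/R_Singularity/$r_version"

    mkdir -p "$lib_dir" || return 1

    # Prepend the directory unless it is already listed; avoid leaving a
    # trailing ':' when R_LIBS_USER was previously unset.
    case ":${R_LIBS_USER:-}:" in
        *":$lib_dir:"*) ;;
        *) R_LIBS_USER="$lib_dir${R_LIBS_USER:+:$R_LIBS_USER}" ;;
    esac
    export R_LIBS_USER
}

# Usage:
# use_container_r_lib 4.1.3
# R -e '.libPaths()'   # the new directory should appear in the list
```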

Running R within the Singularity container

To run an R script using the Singularity container, use the command singularity exec, which runs a command within a container. The syntax is

singularity exec [exec options...] <container> <command>

where <container> is the sing_biocond_3.14.sif image and <command> is R CMD BATCH.

Following the example from R-Basics, this is how we would run the same example within a Singularity container:

singularity exec sing_biocond_3.14.sif R CMD BATCH --no-save --no-restore '--args a=1 b=c(2,5,6)' test.R test.out

where test.R contains:

## First read in the arguments listed at the command line
args=(commandArgs(TRUE))
## args is now a character vector with one element per argument
## First check to see if arguments are passed.
## Then cycle through each element of the vector and evaluate the expressions.
if(length(args)==0){
    print("No arguments supplied.")
    ##supply default values
    a = 1
    b = c(1,1,1)
}else{
    for(i in seq_along(args)){
        eval(parse(text=args[[i]]))
    }
}

print(a)
print(b)

The output file test.out should have the following lines in it:

> print(a)
[1] 1
> print(b)
[1] 2 5 6
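
If you run several scripts this way, the singularity exec call can be wrapped in a small shell function. This is a sketch only: run_r_batch is a hypothetical name, no [exec options...] are passed, and everything after the output file is joined into the single '--args ...' string that the example above uses.

```shell
# Run an R script inside a Singularity image with R CMD BATCH.
# run_r_batch is a hypothetical helper; arguments after the output file
# are joined with spaces and passed to the script as one '--args ...' string.
run_r_batch() {
    image=$1
    script=$2
    outfile=$3
    shift 3

    if [ ! -f "$script" ]; then
        echo "R script not found: $script" >&2
        return 1
    fi

    singularity exec "$image" \
        R CMD BATCH --no-save --no-restore "--args $*" "$script" "$outfile"
}

# Usage, equivalent to the singularity exec command shown above:
# run_r_batch sing_biocond_3.14.sif test.R test.out 'a=1' 'b=c(2,5,6)'
```

Checking for the script up front gives a clearer error than letting R fail inside the container.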

Running R within the Singularity container as SLURM jobs

Combining the R-Basics example with the instructions for running Singularity containers as SLURM jobs, the submit file singularity_R.batch specifies the job requirements:

#!/bin/bash
#SBATCH -J singularity_test
#SBATCH -o singularity_test.out
#SBATCH -e singularity_test.err
#SBATCH -p test
#SBATCH -t 0-00:10
#SBATCH -c 1
#SBATCH --mem=4000

# Singularity command line options
singularity exec sing_biocond_3.14.sif R CMD BATCH --no-save --no-restore '--args a=1 b=c(2,5,6)' test.R test.out

To run as a batch job, submit using sbatch:

[user@holylogin01 ~]$ sbatch singularity_R.batch
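
The submit file refers to the image and to test.R by relative path, and by default a SLURM job starts in the directory from which sbatch was called. A short pre-flight check, sketched here with the hypothetical helper name preflight, can catch a missing file before the job is queued.

```shell
# Check that every file the submit script refers to by relative path
# exists in the current directory before calling sbatch.
# preflight is a hypothetical helper name.
preflight() {
    missing=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage (file names taken from the submit file above):
# preflight sing_biocond_3.14.sif test.R singularity_R.batch && sbatch singularity_R.batch
```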

Building your custom Singularity image

If the image we provide does not include R packages or software that you need, you can add your software installs to the Singularity definition file and build your own image, as explained below.

Prerequisite: Create a Sylabs account

Building a Singularity image requires root access, which is not allowed on FASRC clusters. Sylabs cloud provides a free service where you can build a container remotely instead.

To create a free Sylabs cloud account:

  1. Go to https://cloud.sylabs.io/library
  2. Click “Sign in” on the top right corner
  3. Select your method to sign in: Google, GitLab, GitHub, or Microsoft

Prerequisite: Create a Sylabs access token

To access Sylabs cloud, you need an access token. To create a token follow these steps:

  1. Go to: https://cloud.sylabs.io/
  2. Click “Sign In” and follow the sign in steps.
  3. Click on your login ID in the top right corner.
  4. Select “Access Tokens” from the drop-down menu.
  5. Enter a name for your new access token, such as “Cannon token”.
  6. Click the “Create a New Access Token” grey button.

Prerequisite: Singularity definition file

To build the Singularity container, copy the definition file bioconductor_3.14.def to Cannon and edit it. Add your custom software installs under the %post header:

%post
    apt-get update
    apt-get install -y netcat

For more details, refer to the Singularity definition file documentation.
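
The -y flag in the %post example matters because the build runs without a terminal, so apt-get cannot ask for confirmation. As a rough aid when editing the definition file, the sketch below lists apt-get install lines that lack such a flag. The helper name check_apt_yes is hypothetical, and the check is deliberately simple: it does not recognize combined short options such as -qy.

```shell
# Print every 'apt-get install' line in a definition file that lacks
# a -y/--yes/--assume-yes flag. check_apt_yes is a hypothetical helper name.
# Exit status: 0 if no line was flagged, 1 otherwise.
check_apt_yes() {
    def_file=$1
    if [ ! -r "$def_file" ]; then
        echo "cannot read definition file: $def_file" >&2
        return 1
    fi
    # First grep: candidate lines with their line numbers.
    # Second grep: drop the lines that already carry a yes flag.
    flagged=$(grep -n 'apt-get.*install' "$def_file" \
        | grep -v -E -e '(^|[[:space:]])(-y|--yes|--assume-yes)([[:space:]]|$)') || true
    if [ -n "$flagged" ]; then
        echo "$flagged"
        return 1
    fi
    return 0
}

# Usage:
# check_apt_yes bioconductor_3.14.def
```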

Build a Singularity container

On the cluster, follow these steps:

# request an interactive session on a compute node
$ salloc -p test --time=2:00:00 --mem=4000

# make sure you are in the directory that contains bioconductor_3.14.def
$ ls -l
total 2660840
-rw-r--r-- 1 jharvard jharvard_lab 2178 May 26 13:18 bioconductor_3.14.def

# log in to Sylabs cloud: paste your copied token at the prompt
$ singularity remote login
Generate an access token at https://cloud.sylabs.io/auth/tokens, and paste it here.
Token entered will be hidden for security.
Access Token:
INFO: Access Token Verified!
INFO: Token stored in /n/home01/jharvard/.singularity/remote.yaml

# build the container -- this takes about 30-40 min!
$ singularity build --remote sing_biocond_3.14.sif bioconductor_3.14.def
INFO: Remote "cloud.sylabs.io" added.
INFO: Access Token Verified!
INFO: Token stored in /root/.singularity/remote.yaml
INFO: Remote "cloud.sylabs.io" now in use.
INFO: Starting build...
Getting image source signatures
Copying blob sha256:df3a04b8ec6dab5747870af5fb7e191b04a1ffb4c08112f558aa9ba030f826a2
Copying blob sha256:7c3b88808835aa80f1ef7f03083c5ae781d0f44e644537cd72de4ce6c5e62e00
Copying blob sha256:5e1cc1bb0e8627b4acb34e0fbdb6cec1a994d83f2c48dfe4874ac59c05e7127b

... omitted output ...

INFO: Adding labels
INFO: Adding environment to container
INFO: Creating SIF file...                         ## this step takes a while
INFO: Build complete: /tmp/image-3184544461
WARNING: Skipping container verification
INFO: Uploading 2416373760 bytes
INFO: Build complete: sing_biocond_3.14.sif
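
After a build that takes this long, a quick sanity check of the result is worthwhile before installing packages into it. The sketch below only confirms that the image file is non-empty and that R starts inside it; verify_image is a hypothetical helper name, not a Singularity command.

```shell
# Quick sanity check of a freshly built image: confirm the file exists
# and that R starts inside it. verify_image is a hypothetical helper name.
verify_image() {
    image=$1
    if [ ! -s "$image" ]; then
        echo "image missing or empty: $image" >&2
        return 1
    fi
    if ! command -v singularity >/dev/null 2>&1; then
        echo "singularity not found in PATH" >&2
        return 1
    fi
    # Prints the R version banner if the container can run R.
    singularity exec "$image" R --version
}

# Usage:
# verify_image sing_biocond_3.14.sif
```
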
© The President and Fellows of Harvard College
Except where otherwise noted, this content is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.