R Parallel

Description

Here we briefly explain different ways to use R in parallel on the FASRC Cannon cluster. The best place for more detailed information on R Parallel is our training session.

Parallel computing may be necessary to speed up code or to handle large datasets. The workload is divided into chunks, and each worker (i.e., core) takes one chunk. The goal of parallel computing is to reduce the total computation time by having the workers process their chunks simultaneously.
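
As a minimal sketch of this idea (not taken from the FASRC examples; the worker count and chunking scheme are illustrative), the base R parallel package can split a computation across cores:

# Split a workload into chunks and let each worker (core) process one chunk
library(parallel)
n_workers <- 4                                    # illustrative; match your Slurm request
chunks    <- split(1:1e6, cut(1:1e6, n_workers))  # divide the work into n_workers pieces
partial   <- mclapply(chunks, function(x) sum(sqrt(x)), mc.cores = n_workers)
total     <- Reduce(`+`, partial)                 # combine the partial results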

Usage

Request an interactive node

salloc -p test --time=0:30:00 --mem=4000

Load required software modules.

# Compiler, MPI, and R libraries
module load gcc/9.3.0-fasrc01 openmpi/4.0.5-fasrc01 R/4.0.5-fasrc02

Examples

The User Codes repository has a summary of R parallel packages that can be used on Cannon. A complete list of available packages can be found on CRAN.
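
For example, a CRAN package such as future can be installed into your personal R library from within an R session (a generic illustration; the package name and repository URL are assumptions, not FASRC-specific instructions):

# Install a parallel package from CRAN into your personal library
install.packages("future", repos = "https://cloud.r-project.org")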

Processing large datasets
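
One common pattern, shown here as a sketch rather than the official example (the file name and thread count are assumptions), is to read a large file with multiple threads using the data.table package:

# Read a large CSV with several threads; match nThread to your core request
library(data.table)
dt <- fread("large_dataset.csv", nThread = 4)   # file name is illustrative
summary(dt)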

Single-node, multi-core (shared memory)
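
On a single node, packages such as parallel or foreach with doParallel use the cores of one Slurm allocation through shared memory. A minimal sketch, assuming the core count is taken from SLURM_CPUS_PER_TASK:

library(foreach)
library(doParallel)

# Use the cores granted by Slurm (fall back to 4 if run outside a job)
n_cores <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "4"))
cl <- makeCluster(n_cores)
registerDoParallel(cl)

# Each iteration runs on its own core within the node
res <- foreach(i = 1:100, .combine = c) %dopar% sqrt(i)

stopCluster(cl)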

Multi-node, distributed memory
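
Across nodes there is no shared memory, so workers communicate through MPI. A minimal sketch using the pbdMPI package (one of several options; it must be installed against the loaded OpenMPI module, and the launch command is an assumption):

# mpi_sum.R -- launch with, e.g.: srun -n 8 Rscript mpi_sum.R
library(pbdMPI)
init()

# Each MPI rank (possibly on a different node) computes a partial sum
my_part <- sum(sqrt(seq(comm.rank() + 1, 1e6, by = comm.size())))

# Combine the partial results and print once from rank 0
total <- reduce(my_part, op = "sum")
comm.print(total)

finalize()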

Hybrid: Multi-node + shared-memory

Using nested futures and the future.batchtools package, we can run a job that is both multi-node and multi-core, as sketched below.
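
A minimal sketch of this approach, assuming a Slurm template file named slurm.tmpl and resource field names that match your future.batchtools configuration:

library(future)
library(future.apply)
library(future.batchtools)

# Outer level: each chunk becomes its own Slurm job (template file name and
# resource fields are assumptions); inner level: parallel R sessions across
# the cores allocated to that job.
plan(list(
  tweak(batchtools_slurm, template = "slurm.tmpl",
        resources = list(ncpus = 4, walltime = 600, memory = "4g")),
  multisession
))

results <- future_lapply(1:4, function(chunk) {
  # Inner futures run as parallel sessions inside one Slurm job
  future_sapply(1:8, function(i) sqrt(chunk * i))
})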

Resources

© The President and Fellows of Harvard College.
Except where otherwise noted, this content is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.