OpenMP Software on the FASRC cluster

Introduction

OpenMP, or Open Multi-Processing, is an application programming interface (API) for shared-memory parallel programming for C, C++, and Fortran. It allows developers to create parallel applications that can utilize multiple processing cores and GPUs on a single node. OpenMP simplifies the process of parallel programming by allowing developers to incrementally parallelize existing serial code through the addition of compiler directives and runtime library routines. Parallelism can then be enabled at compile-time by specifying a command-line option, and tuned at run-time (or disabled) by setting specific environment variables.

Using OpenMP on the FASRC Cluster

  • Supported compilers are provided by the “intel” and “gcc” environment modules.
  • Ensure that the same compiler environment module that was used at compile-time is loaded before running the application. This ensures the same version of the OpenMP library is dynamically linked and used at run-time.
  • The environment variable OMP_NUM_THREADS controls the number of threads a parallel region uses; on the cluster it defaults to 1, so set it explicitly (e.g., export OMP_NUM_THREADS=8) before running a multi-threaded job.

Example Code

Below are simple OpenMP example codes in both Fortran and C++.

In the examples:

  1. Each thread prints its thread number in a critical region, which enforces mutual exclusion (i.e., can be executed by only one thread at a time), to avoid mangling concurrent output.
  2. All threads wait until all other threads have encountered the barrier.
  3. A single thread outputs the number of threads executing the parallel region.

Fortran:

!=====================================================================
! Program: omp_test.f90
!=====================================================================
program omp_test
  use omp_lib
  implicit none
!$omp parallel
!$omp critical
  write(*,*) "Thread number:", omp_get_thread_num()
!$omp end critical
!$omp barrier
!$omp single
  write(*,*) "Number of threads = ", omp_get_num_threads()
!$omp end single
!$omp end parallel
end program

C++:

//====================================================================
// Program: omp_test.cpp
//====================================================================
#include <iostream>
#include <omp.h>
using namespace std;
int main()
{
#pragma omp parallel
  {
    #pragma omp critical
    cout << "Thread number: " << omp_get_thread_num() << endl;
    #pragma omp barrier
    #pragma omp single
    cout << "Number of threads = " << omp_get_num_threads() << endl;
  }
}

Compiling the program

Intel Fortran:

[username@rclogin02 ~]$ module load intel/25.0.1-fasrc01
[username@rclogin02 ~]$ ifx -o omp_test.x omp_test.f90 -qopenmp

Intel C++:

[username@rclogin02 ~]$ module load intel/25.0.1-fasrc01
[username@rclogin02 ~]$ icpx -o omp_test.x omp_test.cpp -qopenmp

GNU Fortran:

[username@rclogin02 ~]$ module load gcc/14.2.0-fasrc01
[username@rclogin02 ~]$ gfortran -o omp_test.x omp_test.f90 -fopenmp

GNU C++:

[username@rclogin02 ~]$ module load gcc/14.2.0-fasrc01
[username@rclogin02 ~]$ g++ -o omp_test.x omp_test.cpp -fopenmp

Running the program

The following Slurm batch-job submission script can be used to submit the job to the queue:

#!/bin/bash
#SBATCH -J omp_test
#SBATCH -o omp_test.out
#SBATCH -p test
#SBATCH -t 10
#SBATCH --mem=1750
#SBATCH -c 8
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
module load gcc/14.2.0-fasrc01   # replace with "intel/<version>" if compiled using intel compiler
srun -c $SLURM_CPUS_PER_TASK ./omp_test.x

The OMP_NUM_THREADS environment variable sets the number of OpenMP threads to match the number of CPU cores allocated to the job by the Slurm #SBATCH -c 8 directive. If you name the above script omp_test.batch, for instance, the job is submitted to the queue with

sbatch omp_test.batch

© The President and Fellows of Harvard College
Except where otherwise noted, this content is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.