Overview

MPI (Message Passing Interface) is a standard interface for explicitly passing messages between the processes of a parallel program. To take advantage of MPI, you must add message-passing constructs to your program.

More information: https://www.mcs.anl.gov/research/projects/mpi/

Usage

Both Open MPI and Intel MPI are supported on Gadi.

You can check the versions installed on Gadi with a module query:

$ module avail openmpi

or

$ module avail intel-mpi

We normally recommend using the latest version available, and we always recommend specifying the version number in the module command:

$ module load openmpi/4.1.1

or

$ module load intel-mpi/2021.3.0

For more details on using modules see our modules help guide at https://opus.nci.org.au/display/Help/Environment+Modules.
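
For example, after loading a module you can confirm which MPI installation is active. This is only a sketch; the version shown is an example, so use whichever version module avail reports:

$ module load openmpi/4.1.1
$ module list          # shows the currently loaded modules
$ which mpirun         # should point at the loaded Open MPI installation
$ mpirun --version     # prints the version of the MPI launcher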

Compiling and Linking

Follow the link https://opus.nci.org.au/display/Help/Compiling+and+Linking#CompilingandLinking-UsingMPI for information about how to create a binary executable of your MPI-enabled application using MPI libraries.
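
As a minimal sketch, compiling a single-file MPI program typically looks like the following once an MPI module is loaded. Here my_mpi_app.c and my_mpi_app.f90 are hypothetical source files, and the wrapper names and flags can differ between MPI libraries and compilers, so see the linked guide for details:

$ module load openmpi/4.1.1
$ mpicc -O2 -o my_mpi_app my_mpi_app.c       # C source
$ mpifort -O2 -o my_mpi_app my_mpi_app.f90   # Fortran source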

Open MPI Job

An example PBS job submission script named mpi_job.sh is provided below. It requests 48 CPU cores, 128 GiB of memory, and 400 GiB of local disk on a Gadi compute node from the normal queue, for exclusive access for 30 minutes, charged against project a00. It also requests that the job start in the directory from which it was submitted. Save this script in the working directory from which the analysis will be run. To change the number of CPU cores, memory, or jobfs required, simply edit the corresponding PBS resource requests at the top of the job script file according to the information available at https://opus.nci.org.au/display/Help/Queue+Structure. Note that if your application does not run in parallel, you must set the number of CPU cores to 1 and adjust the memory and jobfs accordingly to avoid wasting compute resources.

#!/bin/bash

#PBS -P a00
#PBS -q normal
#PBS -l ncpus=48
#PBS -l mem=128GB
#PBS -l jobfs=400GB
#PBS -l walltime=00:30:00
#PBS -l wd

# Load module, always specify version number.
module load openmpi/4.1.1

# Must include `#PBS -l storage=scratch/ab12+gdata/yz98` if the job
# needs access to `/scratch/ab12/` and `/g/data/yz98/`. Details on:
# https://opus.nci.org.au/display/Help/PBS+Directives+Explained

# Run application
mpirun -np $PBS_NCPUS <your MPI exe>
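
To run the same job across more than one node, only the resource requests need to change. As a sketch, a 96-core run in the normal queue (a hypothetical sizing, to be checked against the queue limits linked above) would change the header lines to:

#PBS -l ncpus=96
#PBS -l mem=256GB
#PBS -l jobfs=800GB

# mpirun still launches one process per requested core across all nodes.
mpirun -np $PBS_NCPUS <your MPI exe>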

Intel MPI Job

As for the Open MPI example above, the PBS job submission script mpi_job.sh below requests 48 CPU cores, 128 GiB of memory, and 400 GiB of local disk on a Gadi compute node from the normal queue, for exclusive access for 30 minutes, charged against project a00, and starts in the directory from which it was submitted. Adjust the PBS resource requests, or reduce the job to a single core for a serial application, in the same way as described for the Open MPI job above.

#!/bin/bash

#PBS -P a00
#PBS -q normal
#PBS -l ncpus=48
#PBS -l mem=128GB
#PBS -l jobfs=400GB
#PBS -l walltime=00:30:00
#PBS -l wd

# Load module, always specify version number.
module load intel-mpi/2021.3.0

# Must include `#PBS -l storage=scratch/ab12+gdata/yz98` if the job
# needs access to `/scratch/ab12/` and `/g/data/yz98/`. Details on:
# https://opus.nci.org.au/display/Help/PBS+Directives+Explained

export I_MPI_HYDRA_BRANCH_COUNT=$(($PBS_NCPUS / $PBS_NCI_NCPUS_PER_NODE))

# Run application
mpirun -np $PBS_NCPUS <your MPI exe>

The I_MPI_HYDRA_BRANCH_COUNT variable sets the number of nodes for Intel MPI; the line above computes it from the total number of requested cores and the number of cores per node (for example, a 96-core job in the normal queue, with 48 cores per node, gives 2).

Submit and Run Job

To submit and run the job you would use the PBS command:

$ qsub mpi_job.sh
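
Once submitted, the job can be monitored and, if necessary, cancelled with the standard PBS commands. The job ID below is a placeholder taken from the output of qsub:

$ qstat -u $USER       # list your queued and running jobs
$ qstat -s <jobid>     # show the status of, and scheduler comments for, one job
$ qdel <jobid>         # delete the job if it is no longer needed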

Hybrid MPI and OpenMP

Many modern programs can run in a so-called hybrid mode, i.e. MPI communication is used between nodes (or physical CPUs) while OpenMP threads are used within each node (or physical CPU). This may give better performance. If your program supports such a hybrid mode, follow the links below for more information:
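
As a minimal sketch only, assuming an executable built with both MPI and OpenMP support and leaving process placement and thread pinning to the defaults, a hybrid job script might divide the requested cores between MPI ranks and OpenMP threads like this; consult the linked pages before relying on it:

#!/bin/bash

#PBS -P a00
#PBS -q normal
#PBS -l ncpus=96
#PBS -l mem=256GB
#PBS -l walltime=00:30:00
#PBS -l wd

# Load module, always specify version number.
module load openmpi/4.1.1

# Hypothetical split: 4 OpenMP threads per MPI rank.
export OMP_NUM_THREADS=4

# Launch one MPI rank per group of OMP_NUM_THREADS cores. Placement and
# binding options (e.g. mpirun's --map-by) may be needed for good performance.
mpirun -np $(( PBS_NCPUS / OMP_NUM_THREADS )) <your hybrid MPI+OpenMP exe>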