Overview

Darshan is a scalable, lightweight I/O profiling tool for High Performance Computing (HPC), developed by Argonne National Laboratory.

Darshan is designed to capture an accurate picture of application I/O behaviour, including properties such as patterns of access within files, with minimum overhead. It can collect profile information for POSIX, HDF5, NetCDF and MPI-IO calls. Its profile data can be used to investigate and tune the I/O behaviour of MPI applications.

Darshan is primarily designed for MPI applications, which must at minimum call MPI_Init and MPI_Finalize. Non-MPI applications can also be profiled with some extra setup, as described under Usage below.

More information: https://www.mcs.anl.gov/research/projects/darshan/

Usage

You can check the versions installed on Gadi with a module query:

$ module avail darshan

We normally recommend using the latest available version, and we always recommend specifying the version number with the module command:

$ module load darshan/3.2.1

For more details on using modules see our modules help guide at https://opus.nci.org.au/display/Help/Environment+Modules.

Darshan has been built against different versions of Open MPI and Intel MPI. If an Open MPI or Intel MPI module is in your environment when the darshan module is loaded, it will automatically detect the correct library to preload. If no MPI module is detected, a warning is issued and runtime profiling will not work unless the LD_PRELOAD environment variable is set manually; the utilities used to analyse Darshan profiles, as well as compile-time profiling, will still be available. When the mpicc.darshan, mpicxx.darshan or mpif90.darshan compiler wrappers are used, the Darshan library matching the version of MPI in use will be linked correctly at build time.

Darshan can also be used to profile non-MPI applications, however the environment variable DARSHAN_ENABLE_NONMPI=1 must be set, otherwise the application will likely fail and no profile will be generated. Darshan also supports HDF5 profiling, so it can give insight into more complex I/O patterns than standard POSIX I/O semantics alone would show.

By default, profiling data will be stored in your /scratch/$PROJECT/$USER/darshan directory, which the module will create for you the first time you load it. This can be changed at any time by setting the DARSHAN_LOGDIR environment variable.
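
For example, Darshan can be loaded after an MPI module in an interactive session and the resulting environment inspected; the custom log path below is illustrative:

# Load MPI first so the darshan module can detect it.
$ module load openmpi/4.0.2
$ module load darshan/3.2.1

# Check which library will be preloaded and where profiles will be written.
$ echo $LD_PRELOAD
$ echo $DARSHAN_LOGDIR

# Optionally redirect profiles to another writable location.
$ export DARSHAN_LOGDIR=/scratch/$PROJECT/$USER/my_darshan_logs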

There are two complementary methods of using Darshan to profile the I/O of MPI applications: compile-time instrumentation and runtime instrumentation.

Compile-time Instrumentation

Darshan can be added to an application at compile time by using the mpicc.darshan, mpif90.darshan or mpicxx.darshan wrappers for C, Fortran and C++ applications respectively. The wrappers insert the libdarshan.so shared library before any other shared library specified on the linker command line. They will also link in the libhdf5.so shared library; otherwise your application would not compile, due to the HDF5 symbols in libdarshan.so that are required for HDF5 profiling. This is required whether your application uses HDF5 or not. Once this is done, you do not need to load the darshan module to generate a profile for your application at runtime, though you will need to set the DARSHAN_LOGDIR environment variable manually. Note that compile-time instrumentation is not supported for non-MPI applications: using the Darshan wrappers on a non-MPI application will not link the Darshan profiling library and thus will not generate profiles. Runtime instrumentation is required to profile non-MPI applications. Below is an example of building an application with Darshan compile-time instrumentation, then running it later in a PBS job.

Example build script:

# Load modules, always specify version number.
# Note, load darshan before MPI, we do not want LD_PRELOAD set for builds.
module load darshan/3.2.1
module load intel-compiler/2020.2.254
module load openmpi/4.0.2

make CC=mpicc.darshan CXX=mpicxx.darshan FC=mpif90.darshan
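
To confirm that Darshan was linked into the resulting binary, you can inspect it with ldd; the executable name below is a placeholder:

# Hypothetical executable name, replace with your own binary.
$ ldd ./my_mpi_app | grep -E 'darshan|hdf5'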

Example PBS job script:

#!/bin/bash

#PBS -P a00
#PBS -q normal
#PBS -l ncpus=48
#PBS -l mem=128GB
#PBS -l jobfs=400GB
#PBS -l walltime=00:30:00
#PBS -l wd

# Load module, always specify version number.
module load openmpi/4.0.2

# Must include `#PBS -l storage=scratch/ab12+gdata/yz98` if the job
# needs access to `/scratch/ab12/` and `/g/data/yz98/`. Details on:
# https://opus.nci.org.au/display/Help/PBS+Directives+Explained

# Set DARSHAN_LOGDIR
mkdir logdir
export DARSHAN_LOGDIR=./logdir

# Run application
mpirun -np $PBS_NCPUS <your Darshan linked MPI exe>
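
Once the job completes, the generated profile can be found in the directory pointed to by DARSHAN_LOGDIR, for example:

# List the generated Darshan log files.
$ ls ./logdir/*.darshan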

Runtime Instrumentation

Darshan can be used to profile an application that has already been built simply by loading the darshan module immediately before running your application.

Example PBS job script:

#!/bin/bash

#PBS -P a00
#PBS -q normal
#PBS -l ncpus=48
#PBS -l mem=128GB
#PBS -l jobfs=400GB
#PBS -l walltime=00:30:00
#PBS -l wd

# Load modules, always specify version number.
module load openmpi/4.0.2
# Note, load darshan after MPI, we need it to detect the correct library to LD_PRELOAD from the environment.
module load darshan/3.2.1

# Must include `#PBS -l storage=scratch/ab12+gdata/yz98` if the job
# needs access to `/scratch/ab12/` and `/g/data/yz98/`. Details on:
# https://opus.nci.org.au/display/Help/PBS+Directives+Explained

# Set DARSHAN_LOGDIR
mkdir logdir
export DARSHAN_LOGDIR=./logdir

# Run application
mpirun -np $PBS_NCPUS <your non-Darshan linked existing MPI exe>

The above two job scripts request 48 CPU cores, 128 GiB of memory, and 400 GiB of local disk on a Gadi compute node in the normal queue, for exclusive access for 30 minutes, charged against project a00. They also request that the job start in the directory from which it was submitted. These scripts should be saved in the working directory from which the analysis will be done. To change the number of CPU cores, memory, or jobfs required, simply modify the appropriate PBS resource requests at the top of the job script files according to the information available at https://opus.nci.org.au/display/Help/Queue+Structure. Note that if your application does not run in parallel, you should set the number of CPU cores to 1 and reduce the memory and jobfs accordingly to avoid wasting compute resources.
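
For example, a serial run might use resource requests along the following lines; the memory and jobfs values are placeholders to be adjusted for your application:

#PBS -l ncpus=1
#PBS -l mem=4GB
#PBS -l jobfs=10GB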

Another example for profiling non-MPI applications:

# Load module, always specify version number.
# This will issue a warning if no MPI module is detected.
$ module load darshan/3.2.1

$ export DARSHAN_ENABLE_NONMPI=1

# Any of the 3 symlinked libraries will work for this variable as we are not using MPI
$ export LD_PRELOAD=${DARSHAN_ROOT}/lib/libdarshan_ompi3.so

# Set DARSHAN_LOGDIR
$ mkdir logdir
$ export DARSHAN_LOGDIR=./logdir

# Run application
$ <your non-MPI exe>

Note that it is not advisable to set the LD_PRELOAD environment variable for anything other than the application you wish to profile, as this can have unintended side effects on standard system calls. The functions defined in the library named in LD_PRELOAD override any functions of the same name in system libraries. This is particularly important for Darshan, as it overrides key system functions such as open and read. If you load the darshan module earlier than necessary in your login sessions or PBS jobs, you may experience instability and unintended side effects.
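
One way to limit the scope of the preload is to set the variables only for the command being profiled, rather than exporting them for the whole session, for example:

# Preload Darshan for this command only; the variables are not left set in the shell.
$ DARSHAN_ENABLE_NONMPI=1 LD_PRELOAD=${DARSHAN_ROOT}/lib/libdarshan_ompi3.so <your non-MPI exe>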

Generate PDF Summary Report from Logs

# Load module, always specify version number.
# This will issue a warning if no MPI module is detected.
$ module load darshan/3.2.1

# Generate PDF summary report
$ darshan-job-summary.pl ./logdir/<UserID_ExecutableName_idJobID>***.darshan
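
If you prefer a plain-text view of the recorded counters, the darshan-parser utility shipped with Darshan can dump the contents of a log to stdout:

# Dump all recorded counters as text.
$ darshan-parser ./logdir/<UserID_ExecutableName_idJobID>***.darshan | less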

View PDF Summary Report

Log in to Gadi with X11 (X-Windows) forwarding. On Linux, Mac or Unix, add the -Y option to your SSH command to request that SSH forward the X11 connection to your local computer. On Windows, we recommend using MobaXterm (http://mobaxterm.mobatek.net) as it uses X11 forwarding automatically.

# View PDF summary report
$ evince <UserID_ExecutableName_idJobID>***.darshan.pdf &
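
Alternatively, if X11 forwarding is not convenient, you can copy the report to your local machine and open it there; the remote path below is illustrative:

# Run from your local machine; adjust the remote path to where the report was generated.
$ scp gadi.nci.org.au:<path-to-report>/<UserID_ExecutableName_idJobID>***.darshan.pdf .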

Known Issues

Darshan was primarily developed with the MPICH MPI implementation in mind and, as such, is not thoroughly tested with Open MPI. So far, we have found that the romio MPI-IO implementation packaged in Open MPI is not compatible with Darshan. If the darshan module detects Open MPI in the environment when it is loaded, it will automatically set the environment variable OMPI_MCA_io=ompio, which makes Open MPI use its own MPI-IO implementation. Unfortunately, this implementation has been shown to be unstable in some circumstances in earlier versions of Open MPI, so a profile may not be generated in some cases when using Open MPI 3.1.4 or earlier.
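
If you set LD_PRELOAD manually instead of relying on the module, you can apply the same workaround yourself:

# Force Open MPI to use its own MPI-IO implementation (ompio) rather than romio,
# matching what the darshan module sets automatically.
$ export OMPI_MCA_io=ompio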