Overview

PAPI stands for Performance Application Programming Interface. It provides the tool designer and application engineer with a consistent interface and methodology for use of the performance counter hardware found in most major microprocessors. It enables software engineers to see, in near real time, the relation between software performance and processor events. In addition, Component PAPI provides access to a collection of components that expose performance measurement opportunities across the hardware and software stack.

More information: http://icl.cs.utk.edu/papi/

Usage

You can check the versions installed on Gadi with a module query:

$ module avail papi

We normally recommend using the latest version available, and we always recommend specifying the version number with the module command:

$ module load papi/5.7.0

For more details on using modules see our modules help guide at https://opus.nci.org.au/display/Help/Environment+Modules.

Instrumentation of Program

PAPI requires user instrumentation of the program: the relevant include file and PAPI function calls must be inserted in the subroutines that are to be measured, as sketched in the example after the list below.

  • For C, please include the file papi.h
  • For Fortran 77, please include the file f77papi.h
  • For Fortran 90, please include the file f90papi.h
  • If you intend to preprocess your Fortran code, you may use the file fpapi.h
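
As an illustration, the sketch below (the file name papi_example.c and the chosen preset events are examples only, not prescribed by this page) uses the low-level PAPI event-set API from papi.h to count total cycles and total instructions around a region of interest. Adapt the events and the measured region to your own code:

/* papi_example.c -- a minimal, hypothetical sketch of PAPI instrumentation.
   It counts total cycles and total instructions around a region of interest
   using the low-level event-set API declared in papi.h. */
#include <stdio.h>
#include <stdlib.h>
#include "papi.h"

int main(void)
{
    int event_set = PAPI_NULL;
    long long counts[2];
    double sum = 0.0;

    /* Initialise the library and check it matches the papi.h used at compile time. */
    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) {
        fprintf(stderr, "PAPI_library_init failed\n");
        return EXIT_FAILURE;
    }

    /* Create an event set holding two preset events. */
    if (PAPI_create_eventset(&event_set) != PAPI_OK ||
        PAPI_add_event(event_set, PAPI_TOT_CYC) != PAPI_OK ||
        PAPI_add_event(event_set, PAPI_TOT_INS) != PAPI_OK) {
        fprintf(stderr, "could not set up the PAPI event set\n");
        return EXIT_FAILURE;
    }

    PAPI_start(event_set);

    /* Region of interest: replace with the code you want to measure. */
    for (int i = 0; i < 1000000; i++)
        sum += i * 0.5;

    PAPI_stop(event_set, counts);

    printf("Total cycles:       %lld\n", counts[0]);
    printf("Total instructions: %lld\n", counts[1]);
    printf("(checksum, to keep the loop from being optimised away: %f)\n", sum);
    return EXIT_SUCCESS;
}

A Fortran program follows the same pattern through the PAPIF_* wrappers declared in the Fortran include files listed above.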

Then compile and link the program against PAPI:

# Load modules, always specify version number.
$ module load openmpi/4.0.2
$ module load papi/5.7.0

$ mpicc <Compiler Options> -I$PAPI_BASE/include -c -o mpi_program.o mpi_program.c
$ mpicc <Linkage Options> -o mpi_program mpi_program.o -L$PAPI_BASE/lib -lpapi

An example PBS job submission script named papi_job.sh is provided below. It requests 48 CPU cores, 128 GiB of memory, and 400 GiB of local disk on a compute node on Gadi from the normal queue, for exclusive access for 30 minutes, under the project a00. It also requests that the job start in the directory from which it was submitted. Save this script in the working directory from which the analysis will be done. To change the number of CPU cores, memory, or jobfs required, modify the corresponding PBS resource requests at the top of the job script according to the information available at https://opus.nci.org.au/display/Help/Queue+Structure. Note that if your application does not run in parallel, you should set the number of CPU cores to 1 and adjust the memory and jobfs accordingly to avoid wasting compute resources.

#!/bin/bash
  
#PBS -P a00
#PBS -q normal
#PBS -l ncpus=48
#PBS -l mem=128GB
#PBS -l jobfs=400GB
#PBS -l walltime=00:30:00
#PBS -l wd
  
# Load modules, always specify version number.
module load openmpi/4.0.2
module load papi/5.7.0
  
# Must include `#PBS -l storage=scratch/ab12+gdata/yz98` if the job
# needs access to `/scratch/ab12/` and `/g/data/yz98/`. Details on:
# https://opus.nci.org.au/display/Help/PBS+Directives+Explained

# Run application
mpirun -np $PBS_NCPUS ./mpi_program

To submit the job, use the PBS command:

$ qsub papi_job.sh

The instrumentation results, if any, are printed to stdout, i.e. they appear in the job output file papi_job.sh.o<jobID> once the job completes.

PAPI Utilities

Load PAPI module:

# Load module, always specify version number.
$ module load papi/5.7.0

Then list the contents of the $PAPI_BASE/bin directory and consult the corresponding man pages, or see http://icl.cs.utk.edu/papi/docs/ for details.
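
For example, to see which utilities are installed and which hardware events the current node supports (papi_avail and papi_native_avail are standard PAPI utilities; the exact set shipped with a given module version may vary):

$ ls $PAPI_BASE/bin
$ papi_avail          # list PAPI preset events and whether they are supported
$ papi_native_avail   # list native hardware events on this CPU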

PAPI documentation

PAPI Wiki: https://bitbucket.org/icl/papi/wiki/Home

How to Use PAPI: https://bitbucket.org/icl/papi/wiki/Using-PAPI.md

PAPI User Guide: http://icl.cs.utk.edu/projects/papi/files/documentation/PAPI_USER_GUIDE.htm