Finding memory problems in code can be difficult, but tools are available on the NCI machines to help. Memory errors can arise in many ways, such as referencing arrays outside their declared bounds, failing to allocate dynamic arrays, attempting to free memory that cannot be freed, or leaking memory so that the memory requirements of the running program keep increasing.
The Intel Fortran compiler option `-check bounds` can be used to pinpoint, at runtime, any array subscript or substring references that are out of their declared bounds.

Electric Fence detects where `malloc()` buffers are overrun or where there is an attempt to touch a memory allocation that has already been freed. Although the Electric Fence documentation says that it can be used to debug MPI codes, this does not work on the AC. It appears that the start-up procedure for an executable linked with `-lefence` invokes some memory allocations that interfere with the SGI mpirun start-up. Code must be linked with the `libefence.a` library before being executed.

Valgrind is part of the system software on Raijin and is very useful for both Fortran and C code. It does not require linking to a library; you simply run your executable under the `valgrind` command.
The Valgrind tool Memcheck checks all reads and writes of memory and intercepts calls to malloc/new/free/delete.
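To make the leak case concrete, the following is a minimal C sketch of the kind of allocation Memcheck tracks. The function names `sum_first_n`, `leaky_sum` and `clean_sum` are hypothetical and exist only for illustration; they are not part of any NCI or Valgrind software. Both variants compute the same result, but only one releases its buffer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: fills buf with 1..n and returns the sum.
   Returns -1 if the allocation failed (buf is NULL). */
static long sum_first_n(int *buf, int n)
{
    long total = 0;
    int i;

    if (buf == NULL) {
        return -1;
    }
    for (i = 0; i < n; i++) {
        buf[i] = i + 1;
        total += buf[i];
    }
    return total;
}

/* Leaky variant: the buffer is never freed, so every call
   leaves n * sizeof(int) bytes unreachable. */
long leaky_sum(int n)
{
    int *buf;

    assert(n > 0);
    buf = malloc((size_t)n * sizeof *buf);
    return sum_first_n(buf, n); /* buf goes out of scope without free() */
}

/* Clean variant: same computation, but the buffer is released. */
long clean_sum(int n)
{
    int *buf;
    long total;

    assert(n > 0);
    buf = malloc((size_t)n * sizeof *buf);
    total = sum_first_n(buf, n);
    free(buf); /* free(NULL) is a no-op, so this is safe on failure too */
    return total;
}
```

If `leaky_sum` is called from a `main` function and the program is compiled with `-g`, running it as `valgrind --leak-check=full ./a.out` should list the unfreed allocation in the leak summary, along with the source line of the `malloc()` call. The `clean_sum` variant should produce no such report.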
Mudflap is part of the more recent versions of gcc/g++. To use mudflap you need to do at least the following:

    $ module load gcc/4.0.1
    $ gcc prog.c -fmudflap -lmudflap
    $ ./a.out
The environment variable `MUDFLAP_OPTIONS` can be used to control the output from mudflap. See http://gcc.gnu.org/wiki/Mudflap_Pointer_Debugging for more details.
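As an illustration of the kind of error that pointer-checking tools such as mudflap or Electric Fence are meant to report, here is a minimal C sketch of a heap buffer that is filled correctly, with a comment marking the one-character change that would turn it into an overrun. The functions `fill_squares` and `last_square` are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: writes count squares into dst.
   The caller must guarantee that dst has room for count ints. */
void fill_squares(int *dst, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        dst[i] = (int)(i * i);
    }
}

/* Allocates exactly n ints, fills them, and returns the last square.
   Returns -1 if the allocation fails. */
int last_square(size_t n)
{
    int *buf;
    int last;

    assert(n > 0);
    buf = malloc(n * sizeof *buf);
    if (buf == NULL) {
        return -1;
    }

    fill_squares(buf, n); /* correct: count matches the allocation */
    /* fill_squares(buf, n + 1) would write one element past the end of
       the allocation. That is a heap overrun, which may go unnoticed in
       a normal run but is what these tools are designed to flag. */

    last = buf[n - 1];
    free(buf);
    return last;
}
```

The code as written stays within bounds, so an instrumented build should run it without complaint. Swapping in the commented-out call is a simple way to check that the tool is active and reporting violations.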
TotalView (https://opus.nci.org.au/display/Help/TotalView) is installed on Gadi and can be used to debug sequential, MPI or OpenMP programs written in C, C++ or Fortran. To use the memory debugging tools of TotalView, you need to log in to Gadi with X11 (X-Windows) forwarding. On Linux/Mac/Unix, add the `-Y` option to your SSH command to request that SSH forward the X11 connection to your local computer. For Windows, we recommend MobaXterm (http://mobaxterm.mobatek.net), as it automatically uses X11 forwarding. Then recompile your code as follows:
    # Load module, always specify version number.
    $ module load totalview/2020.1.13
    $ gfortran -g prog.f -L$TVLIB -ltvheap
    $ gcc -g prog.c -L$TVLIB -ltvheap
Then start an interactive PBS job as described in https://opus.nci.org.au/display/Help/TotalView. When the interactive job starts, start TotalView by executing the `totalview` command and click on `Tools -> Memory Debugging`.