This tutorial is designed to show you how to download, compile, and run a simple job on Gadi, allowing you to practice submission before running your own binary.
In this instance you will be running a simple 'Hello world' job, which will teach you the basics of building and submitting a job, while also showing you some tips on monitoring the job as it runs.
For the majority of jobs, you can follow a workflow similar to this tutorial: transfer your program to Gadi, compile it, write a PBS submission script, submit the job, and monitor it while it runs.
To begin, open your preferred SSH client and log in to Gadi.
Once you are logged in, you will need to download the program. This can be done by clicking this link: hello_mpi.c. You will then need to transfer this file into your home directory; for guidance on this, please see our file transfer guide.
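If you would rather create the file directly on Gadi, you can paste it in from the command line with a heredoc. Note that the listing below is an assumed, typical MPI 'hello world' sketch; the actual hello_mpi.c behind the link may differ slightly.

```shell
# Write a typical MPI "hello world" to hello_mpi.c in the current
# directory. (Assumed content -- the file from the link may differ.)
cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);               /* start the MPI runtime     */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's rank (ID)  */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of processes */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF
```

Each MPI process (rank) prints one line, which is why the job output later in this tutorial shows one 'Hello' per CPU requested.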
To compile this program into something that can run on Gadi, we will be using the command mpicc, which compiles MPI programs written in C. We will also be using openmpi to help with the compilation. To do so, we need to load the openmpi module into the environment. You can read our software applications guide for an in-depth look at loading and unloading software, but for now we will run the command
$ module avail openmpi
OpenMPI is an open-source implementation of the Message Passing Interface (MPI) standard, which assists with running parallel jobs on a supercomputer.
As you can see on the right, the 'module avail openmpi' command lists all of the available versions of openmpi on Gadi. NCI recommends using a version that you know is compatible with your binary, limiting the risk of unnecessary errors. In this case, we will use the latest version of openmpi by running the command
$ module load openmpi/4.1.5
You can check what modules you have loaded into the environment by running the command
$ module list
You should see that openmpi has been loaded, along with PBS, which is automatically loaded upon login.
Once the module is successfully loaded, we can begin compiling the program. To do this, run the command
$ mpicc -o hello_mpi hello_mpi.c
This will compile a binary named hello_mpi, which will be created in your home directory. If you want to check that it was produced successfully, you can run
$ ls
On the right you can see the original file, hello_mpi.c, and the new file, hello_mpi. This is the file you will use to run your job, and it will be referenced when writing a PBS script for the job.
To run jobs on Gadi, users need to create a PBS submission script. A PBS script is a list of parameters that tells Gadi what resources you want to use while running the job. You can outline how much walltime your job will take, the number of CPUs required, and how much memory to allocate to it. You can read our job submission guide for an in-depth look at how to write submission scripts, along with our PBS environment guide for useful ways to customise your scripts.
For this job we will be writing a very simple script. It can be written in whichever editor you wish, e.g. vim or nano, and contains simple lines that tell PBS how you want to run your job. We will use vim in this case, which you can open by simply running the command
$ vim
For people new to Linux and text editors in general, you can get a great breakdown of vim and its uses, including short lessons, by running
$ vimtutor
Once vim is open, you can start preparing your script. It should look similar to the simple outline below.
#!/bin/bash
#PBS -P <Project code>
#PBS -q normal
#PBS -l ncpus=4
#PBS -l mem=16gb
#PBS -l walltime=00:10:00

module load openmpi/4.1.5
mpirun ./hello_mpi
These parameters tell the PBS scheduler how you would like to run your job, in this case: which project to charge (-P), the queue to submit to (-q normal), 4 CPUs (-l ncpus=4), 16GB of memory (-l mem=16gb), and a maximum walltime of 10 minutes (-l walltime=00:10:00). The final two lines load the openmpi module and launch the compiled binary with mpirun.
To save your script, enter :wq script.sh to write (w), quit (q), and name your file script.sh. This will save your script and return you to your home directory, where you will be ready to submit your job to Gadi.
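As an aside, if you prefer not to use an interactive editor, the same script can be created non-interactively with a heredoc. This is a sketch; substitute your own project code for <Project code>.

```shell
# Write the PBS submission script without opening an editor.
# Quoting 'EOF' stops the shell expanding anything inside the heredoc.
cat > script.sh <<'EOF'
#!/bin/bash
#PBS -P <Project code>
#PBS -q normal
#PBS -l ncpus=4
#PBS -l mem=16gb
#PBS -l walltime=00:10:00

module load openmpi/4.1.5
mpirun ./hello_mpi
EOF
```

This produces exactly the same script.sh as the vim session above, which can be handy when scripting job setup.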
Now that you have compiled your program and written your job script, it's time to submit your job to the PBS scheduler.
Jobs are submitted using the command qsub. For this job, we can run the command
$ qsub script.sh
This will submit the job and give you a jobID, as you can see below.
You can now begin monitoring the job and gathering data on what is happening while it is running on Gadi.
There are several methods for monitoring jobs and analysing data; for this tutorial, we will focus on one of the simpler ones. For a more in-depth look at ways to monitor jobs, please see our job monitoring page.
To monitor your job while it is running, you can enter the command
$ qstat -swx <jobID>
This will print information like the following.
As you can see in this screenshot, the job entered the queue and finished; as it is a very small job, it took only one second to complete.
Once the job has finished, it will produce two new files in your home directory, named script.sh.o<JobID> and script.sh.e<JobID>.
The first file, with 'o' in the file name, is the output of the job, which should have run successfully on Gadi. The second file, with 'e' in the file name, is the error stream; it documents any errors that occurred while the job was running. In this case, the error stream file should be empty.
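The split between these two files mirrors the standard output and standard error streams of any Unix process. The snippet below is a purely local illustration of that separation; the filenames and jobID are made up, not real PBS output.

```shell
# Redirect a command's stdout and stderr to separate files, the way
# PBS writes the .o and .e files for a job. Illustrative names only.
(
    echo "Hello from rank 0"       # written to standard output
    echo "example warning" >&2     # written to standard error
) > script.sh.o12345 2> script.sh.e12345

cat script.sh.o12345   # contains only the stdout line
cat script.sh.e12345   # contains only the stderr line
```

A job that writes nothing to standard error, like the hello_mpi job here, therefore leaves an empty .e file.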
The output file can be opened by running the command
$ less script.sh.o<JobID>
This gives you the following information.
This shows the job reporting back 'Hello' from the 4 CPUs that we requested, along with other information that will be useful when running your own jobs. If your job script had asked for more than four CPUs, they would all echo 'Hello' along with the ones you see here.
You have run your first job on Gadi!
Although it might seem small, it is a great stepping stone to learning more about high performance computing.