Gadi is Australia’s most powerful supercomputer, a highly parallel cluster comprising more than 150,000 processor cores on ten different types of compute nodes. Gadi accommodates a wide range of tasks, from running climate models to genome sequencing, from designing molecules to astrophysical modelling.

Introduction to Gadi is designed for new users and for users who want a refresher on the basics of Gadi.

To register for this training, select your session via this link.

If you have any questions regarding this training, please contact training.nci@anu.edu.au.

Date/Time

March 3, 2022, 2:00-3:30pm AEDT

April 7, 2022, 2:00-4:00pm AEST

May 12, 2022, 2:00-4:00pm AEST

June 2, 2022, 2:00-4:00pm AEST

July 7, 2022, 2:00-4:00pm AEST

August 4, 2022, 2:00-4:00pm AEST

September 1, 2022, 2:00-4:00pm AEST

October 6, 2022, 2:00-4:00pm AEDT

November 3, 2022, 2:00-4:00pm AEDT

Prerequisites

The only prerequisite for this course is an active NCI user account that is ready for login. If you do not have an NCI user account, you may still register for this course; however, you will not be able to take full advantage of the hands-on exercises.

Attendees are strongly encouraged to review the following pages, which contain essential background information, before the course.

Objectives

This training aims to empower attendees to work with confidence on Gadi by giving them a basic understanding of

resource accounting
the differences between login, compute and data-mover nodes
job submission and management
the module environment for using software applications
the basic skills needed to plan, track and manage their jobs on Gadi.

Learning outcomes

At the completion of this training you will be able to

log in to Gadi
transfer data on and off Gadi
run module commands to customise your user environment and configure software applications
submit jobs
check and maintain compute, storage, and job status
estimate job cost
request resources adequate for your jobs
monitor job status, progress and resource utilisation
understand common reasons why jobs finish with errors
ask questions about jobs like a pro

Topics covered

Login nodes and login environment
Shared filesystems and jobfs
Home, lustre, and tape filesystems
Home and project folders
Data transfer and data mover nodes
Compute grant, resource hours and PBS queues
Job submission and output/error logs
Applications, modules and software groups
Login, copyq, and different compute nodes
Job cost and resource hours
PBS directives
Tools for job monitoring before, during and after the run
Common reasons why jobs are not running
Common reasons why jobs fail

Course Contents

The course content is available as a PDF below. Last updated 3 March 2022.
