...

The second section, with all the lines that start with ‘#PBS’, specifies how much of each resource the job needs. It requests an allocation of 48 CPU cores, 190 GiB of memory, and 200 GiB of local disk on a compute node in the normal queue, for the job's exclusive access for 2 hours. It also asks the system to mount the a00 project folders on both the /scratch and /g/data filesystems inside the job, and to enter the working directory once the job starts. Please see more PBS directives explained here. Note that a ‘-lstorage’ directive must be included if you need access to /g/data, otherwise these folders will not be accessible while the job is running.
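For concreteness, the request described above could be written as the following directive block (a minimal sketch; the project code a00 comes from the example above, and the values should be adapted to your own project and task):

    #!/bin/bash
    # Second section: resource requests, matching the allocation described above.
    #PBS -q normal                          # submit to the normal queue
    #PBS -l ncpus=48                        # 48 CPU cores (a whole node, exclusive)
    #PBS -l mem=190GB                       # 190 GiB of memory
    #PBS -l jobfs=200GB                     # 200 GiB of local disk on the compute node
    #PBS -l walltime=02:00:00               # 2 hours of wall time
    #PBS -l storage=scratch/a00+gdata/a00   # mount /scratch/a00 and /g/data/a00 in the job
    #PBS -l wd                              # enter the working directory once started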

To find the right queue for your jobs, please browse the Gadi Queue Structure and Gadi Queue Limit pages. 

Info

Users are encouraged to request resources so that the task(s) run near the ‘sweet spot’, where the job benefits from parallelism and achieves a shorter execution time while utilising at least 80% of the requested compute capacity.

While searching for the sweet spot, please be aware that it is common for a task to contain components that run only on a single core and cannot be parallelised. These sequential parts drastically limit parallel performance. For example, a workload with just 1% sequential parts has its overall CPU utilisation limited to less than 70% when running in parallel on 48 cores. Moreover, parallelism adds overhead which in general grows with increasing core count and, beyond the ‘sweet spot’, wastes time on unnecessary task coordination.
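As a worked check of that figure, Amdahl's law bounds the speedup S on N cores for a workload with sequential fraction s, and hence bounds the utilisation S/N (an illustrative calculation, not an additional limit imposed by Gadi):

    S(N) = \frac{1}{s + \frac{1-s}{N}}, \qquad
    S(48) = \frac{1}{0.01 + \frac{0.99}{48}} \approx 32.7, \qquad
    \frac{S(48)}{48} \approx 68\% < 70\%.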

...

Info

Users are encouraged to run their tasks in bigger jobs where possible, to take advantage of the massive parallelism that can be achieved on Gadi. However, depending on the application, it may not be possible for a job to run on more than a single core/node. For applications that do run on multiple cores/nodes, the commands and scripts/binaries called in the third section determine whether the job can utilise the requested amount of resources. Users need to edit the script/input files that define the compute task to allow it, for example, to run on multiple cores/nodes. It may take several iterations over sections two and three of the submission script to find the ideal settings when exploring around the job's sweet spot.
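As an illustration of how the third section governs parallelism, the sketch below launches an MPI-enabled task across every requested core. The binary my_app, its input file, and the module version are hypothetical placeholders; a serial command in their place would leave 47 of the 48 requested cores idle regardless of what section two requests:

    # Third section: run the compute task across all requested cores.
    # 'my_app' and 'input.nml' are hypothetical; PBS sets PBS_NCPUS to the
    # number of cores requested in section two (48 in this example).
    module load openmpi/4.1.4
    mpirun -np $PBS_NCPUS ./my_app input.nml > output.log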

...