Job Dependencies
The PBS scheduler on Gadi can manage dependencies between jobs. This means you can tell PBS to run jobs at certain times and under a range of different conditions, so that you use your compute allocation efficiently and control the execution order of your jobs. There is no limit to the number of dependencies that you can set, but they do need to follow a sound flow of logic, otherwise they could create unexpected conflicts or strange behaviour. This page is a short introduction to the concept of job dependencies, along with some examples of sound logic.
When using dependencies, you need to flag them to the scheduler with the -W depend= option, for example:
Code Block |
---|
|
$ qsub -W depend=beforeany:1234567:1234578 job.sh |
This line uses the beforeany dependency type to tell the PBS scheduler that neither job 1234567 nor job 1234578 can start until job.sh has finished running, whether or not it exited with errors.
When using the -W depend option, users must enter job IDs joined by colons after the dependency type, as seen above. The only exception is the 'on' dependency type, in which case the argument that follows should be an integer equal to the number of jobs that this job must wait for.
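The colon-joined format described above can be sketched as a small shell helper. The function name build_depend is hypothetical and is not part of PBS; it only assembles the string passed to -W and does not check that the dependency type or the job IDs are valid.

```shell
# Hypothetical helper (not part of PBS): build the argument for -W from a
# dependency type followed by job IDs, or a count for the 'on' type.
# The body runs in a subshell, so changing IFS does not affect the caller.
build_depend() (
    dep_type="$1"
    shift
    # "$*" joins the remaining arguments with the first character of IFS.
    IFS=":"
    printf 'depend=%s:%s\n' "$dep_type" "$*"
)

# Print the strings for an afterok dependency on two jobs and for an 'on' count.
build_depend afterok 16394 16395
build_depend on 2
```

The output could then be passed to the scheduler, for example qsub -W "$(build_depend afterok 16394 16395)" job3.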
Common issues
One of the common problems that can occur with job dependencies is jobs completing and leaving the queue sooner than expected. In this example:
Code Block |
---|
|
$ qsub job1
16394.r-man2
$ qsub job2
16395.r-man2
$ qsub -W depend=afterok:16394:16395 job3
16396.r-man2 |
This tells the PBS scheduler to run job3 after both job1 and job2 have completed with no errors. However, when Gadi is not busy, there is a chance that job1 or job2 could complete so quickly that they have exited the queue before job3 has entered it. In this case, job3 will be left sitting in the queue indefinitely, as PBS can't run it under its current dependency logic.
A solution for this is to set up the dependencies in a different order, for example:
Code Block |
---|
|
$ qsub -W depend=on:2 job3
16397.r-man2
$ qsub -W depend=beforeok:16397 job1
16398.r-man2
$ qsub -W depend=beforeok:16397 job2
16399.r-man2 |
In this case, the 'on' dependency type is used, so job3 waits for two jobs to complete before it will run. By setting up the dependencies this way, the last job to run actually enters the queue first and then waits for the others to complete, removing the risk of a job leaving the queue early.
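The same pattern can be scripted so that the job ID printed by qsub is captured rather than typed by hand. This is a minimal sketch, not an NCI-provided tool: the function name submit_fan_in and the QSUB override variable are hypothetical, and it assumes that qsub prints only the job ID on standard output (as in the examples above) and that the full ID, including the server suffix, is accepted in the dependency list.

```shell
# QSUB is a hypothetical override so the logic can be dry-run with a stub;
# by default the real qsub command is used.
QSUB="${QSUB:-qsub}"

# Hypothetical helper: submit the final job first with depend=on:<count>,
# then submit each upstream job with depend=beforeok:<final job ID>.
# Usage: submit_fan_in final_job upstream_job [upstream_job ...]
submit_fan_in() {
    final_job="$1"
    shift
    # $# is now the number of upstream jobs the final job must wait for.
    final_id=$("$QSUB" -W "depend=on:$#" "$final_job") || return 1
    for job in "$@"; do
        "$QSUB" -W "depend=beforeok:${final_id}" "$job" >/dev/null || return 1
    done
    # Report the ID of the final job so it can be monitored later.
    printf '%s\n' "$final_id"
}
```

Calling submit_fan_in job3 job1 job2 would issue the same three qsub commands as the example above, with the count and the job ID filled in automatically.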
Tip |
---|
NCI recommends that you continue to monitor your jobs regularly to ensure that any complex dependency chains are running correctly and have not stalled. |
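One way to check a job in a dependency chain is to look at its state and its recorded dependencies. The sketch below is illustrative only: the function name job_dep_summary is hypothetical, and it assumes the full listing from qstat -f contains lines of the form "job_state = H" and "depend = ...", which may differ between PBS versions. Very long attribute values can wrap onto continuation lines, which this sketch does not handle.

```shell
# Hypothetical helper: read the output of `qstat -f <jobid>` on standard input
# and print the job state and dependency attribute on a single line.
# Assumes "name = value" lines; prints depend=none if no depend line is found.
job_dep_summary() {
    awk -F' = ' '
        /^[[:space:]]*job_state = / { state = $2 }
        /^[[:space:]]*depend = /    { dep = $2 }
        END { printf "state=%s depend=%s\n", state, (dep == "" ? "none" : dep) }
    '
}
```

For example, qstat -f 16396 | job_dep_summary could be run periodically; a job that stays in the held state (H) long after the jobs it depends on have finished may be stalled.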