Interactive jobs allow users to run jobs that can be monitored and adjusted at certain points during their lifespan. Users can submit jobs in an interactive shell on the head compute node, which allows them to test and debug code before running the entire job. NCI recommends that users utilise this resource to debug large parallel jobs or to install applications that have to be built when GPUs are available. Interactive jobs do not need a PBS script; instead, they are submitted as a command on the login nodes. To do this, run qsub -I followed by the resources you wish to request, for example: Code Block |
---|
| $ qsub -I -qgpuvolta -Pa00 -lwalltime=00:05:00,ncpus=48,ngpus=4,mem=380GB,jobfs=200GB,storage=gdata/a00,wd
qsub: waiting for job 11029947.gadi-pbs to start
qsub: job 11029947.gadi-pbs ready |
This command submits a job to the gpuvolta queue through project a00, requesting 5 minutes of walltime. It asks for 48 CPUs, 4 GPUs, 380 GiB of memory, and 200 GiB of local disk space. Once the job begins, it will mount /g/data/a00 to the job and enter the job's working directory.
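The request above can also be assembled from named variables, which makes it easier to adjust one resource at a time. The following is a minimal sketch, not an official NCI helper: the variable names are illustrative, and a00 is the placeholder project code from the example. It only prints the command, so it can be reviewed before being run on a login node.

```shell
#!/bin/bash
# Sketch: build the interactive qsub command from named variables.
# Values mirror the example above; a00 is a placeholder project code.
queue="gpuvolta"
project="a00"
walltime="00:05:00"
ncpus=48
ngpus=4
mem="380GB"
jobfs="200GB"
storage="gdata/${project}"

# Comma-separated resource list passed to -l; the trailing "wd" asks the
# job to enter its working directory when it starts.
resources="walltime=${walltime},ncpus=${ncpus},ngpus=${ngpus},mem=${mem},jobfs=${jobfs},storage=${storage},wd"

qsub_cmd="qsub -I -q${queue} -P${project} -l${resources}"

# Print the command for review instead of submitting it.
echo "${qsub_cmd}"
```

Copy the printed line into a login-node shell to submit the job once the values look right.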
Once your interactive job has started, you will notice that your prompt has changed from a login node to something similar to this: Code Block |
---|
| [aaa777@gadi-gpu-v100-0079 ~]$ |
This means that you are now logged into a compute node. You can see the change in the example below. Code Block |
---|
| [aaa777@gadi-login-03 ~]$ qsub -I -lwalltime=00:05:00,ncpus=48,ngpus=4,mem=380GB,jobfs=200GB,wd -qgpuvolta
qsub: waiting for job 11029947.gadi-pbs to start
qsub: job 11029947.gadi-pbs ready
[aaa777@gadi-gpu-v100-0079 ~]$ module list
No Modulefiles Currently Loaded.
[aaa777@gadi-gpu-v100-0079 ~]$ exit
logout
qsub: job 11029947.gadi-pbs completed |
This is a very minimal example of an interactive job. By default, the shell doesn't have any modules loaded. If you need to load modules repeatedly inside interactive jobs, you can edit your ~/.bashrc file to load them automatically. Once you are finished with the job, run the exit command to terminate it.
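One way to set up the ~/.bashrc approach is sketched below. It assumes that PBS sets the PBS_ENVIRONMENT variable to PBS_INTERACTIVE inside jobs started with qsub -I, so that login shells and batch jobs are left untouched. The function name and the module names (cuda, openmpi) are hypothetical placeholders; substitute the modules your work actually needs.

```shell
# Sketch for ~/.bashrc: load modules only inside interactive PBS jobs.
# Assumption: PBS sets PBS_ENVIRONMENT=PBS_INTERACTIVE for qsub -I jobs.
load_interactive_modules() {
    # Do nothing outside an interactive PBS job (e.g. on a login node).
    [ "${PBS_ENVIRONMENT:-}" = "PBS_INTERACTIVE" ] || return 0
    # Skip quietly if the module command is not available in this shell.
    command -v module >/dev/null 2>&1 || return 0
    # Hypothetical module names; replace them with your own.
    module load cuda openmpi
}
load_interactive_modules
```

Wrapping the logic in a function keeps the early returns from ending the rest of ~/.bashrc, and the guard avoids loading GPU-specific modules in shells where they are not wanted.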