We do not recommend submitting job arrays on Raijin because of a number of limitations in how they are scheduled. Job arrays are also limited in size.
If you have many small jobs that all use similar resources and will finish around the same time, instead of submitting them individually, look into aggregating them into jobs with a combined resource requirement.
Each job incurs significant overhead in setting up its environment, and the scheduler has to consider every job individually when optimising cluster usage, so having many small jobs increases the total time your jobs wait in the queue. In addition, only 300 queued jobs per project are allowed in the execution queue, so submitting many small jobs forces all other users in your project to wait until your jobs are processed before their jobs can reach the execution queues.
If you have dozens to a few hundred identical multi-CPU jobs, we recommend using loops like this:
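The loop from the original example is not shown here. As a minimal sketch, assuming a hypothetical job script job.sh that reads its case number from an environment variable CASE, a submission loop could look like the following. SUBMIT and NJOBS are illustrative names, not part of any PBS interface; SUBMIT defaults to a dry run that only prints each qsub command.

```shell
#!/bin/bash
# Sketch only: job.sh, CASE, SUBMIT and NJOBS are hypothetical names.
# Dry run by default (prints the commands); set SUBMIT=qsub to really submit.
SUBMIT=${SUBMIT:-echo qsub}
NJOBS=${NJOBS:-5}

submit_all() {
    local i
    for i in $(seq 1 "$NJOBS"); do
        # qsub -v passes CASE into the environment of the submitted job
        $SUBMIT -v CASE="$i" job.sh
    done
}

submit_all
```

Checking the printed commands before setting SUBMIT=qsub is a cheap way to catch mistakes, and NJOBS should stay well below the limit of 300 queued jobs per project.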
Please DO NOT submit THOUSANDS of single-CPU jobs using the example above!
If your single-CPU jobs all use similar resources and will finish around the same time, you can run them in parallel within one job by using Example 2 or 3:
Example 2 allows you to run 16 single-CPU jobs within one node.
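The script for Example 2 is not reproduced here. A minimal sketch of the idea, assuming one 16-core node, could look like this. The walltime and memory requests are placeholder values, and run_task is a hypothetical stand-in for your own single-CPU program.

```shell
#!/bin/bash
#PBS -q normal
#PBS -l ncpus=16
#PBS -l walltime=01:00:00
#PBS -l mem=16GB
#PBS -l wd

# Sketch only: run_task is a hypothetical stand-in for your single-CPU program,
# and the walltime and mem requests above are placeholders.
run_task() {
    echo "task $1 finished" > "task_$1.out"
}

for i in $(seq 1 16); do
    run_task "$i" &     # '&' sends each task to the background so all 16 run at once
done

wait                    # block until every background task has exited
```

Without the final wait, the job script would reach its end and the job would finish while the tasks are still running.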
Example 3 allows you to run 32 single-CPU jobs across two nodes:
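The script for Example 3 is not reproduced here. The following is a sketch under stated assumptions rather than the original: it assumes that pbsdsh -n selects one CPU of the job by a zero-based index, and it uses hypothetical names (TASK, NTASKS, run_on_slot, run_all). When pbsdsh is not available, the sketch falls back to running the commands locally so that it can be tried outside a PBS job.

```shell
#!/bin/bash
#PBS -q normal
#PBS -l ncpus=32
#PBS -l walltime=01:00:00
#PBS -l mem=32GB
#PBS -l wd

# Sketch only: TASK, NTASKS, run_on_slot and run_all are hypothetical names.
TASK=${TASK:-echo running task}   # replace with your single-CPU command
NTASKS=${NTASKS:-32}              # one task per requested CPU

run_on_slot() {
    # run_on_slot INDEX COMMAND: start COMMAND on the CPU with that pbsdsh index.
    if command -v pbsdsh >/dev/null 2>&1; then
        pbsdsh -n "$1" -- bash -l -c "$2"
    else
        bash -c "$2"              # no pbsdsh (not on the cluster): run locally
    fi
}

run_all() {
    local i
    for i in $(seq 0 $((NTASKS - 1))); do
        run_on_slot "$i" "$TASK $i" &   # '&' lets all tasks run at the same time
    done
    wait                                # return only when every task has finished
}

run_all
```

The login shell started with bash -l is used because pbsdsh starts commands in a very minimal environment; whether your program needs it is worth checking on the system itself.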
Please note the ‘&’ at the end of each command line, which runs the command in the background, and the final ‘wait’, which pauses the script until all background tasks finish.
Example 3 above assumes that the commands you run in each pbsdsh command take approximately the same time. Unfortunately, this is not always the case, and Example 4 below shows how to run many single-CPU tasks that may need different amounts of time to execute.
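The script for Example 4 is not reproduced here. The sketch below shows one way such a job could be written, assuming GNU parallel is available and that pbsdsh -n takes a zero-based CPU index. SCRIPT, the inputs.txt file name, the NCPUS helper variable, and the walltime and memory requests are all hypothetical; only the 256 CPUs and the INPUTS variable come from the description of the example.

```shell
#!/bin/bash
#PBS -q normal
#PBS -l ncpus=256
#PBS -l walltime=04:00:00
#PBS -l mem=512GB
#PBS -l wd

# Sketch only: SCRIPT and the INPUTS file name are hypothetical.
SCRIPT=./process_one.sh        # single-CPU program, called with one line of INPUTS
INPUTS=inputs.txt              # one task per line
NCPUS=${PBS_NCPUS:-256}        # assumes PBS_NCPUS is set inside the job

if command -v parallel >/dev/null 2>&1 && command -v pbsdsh >/dev/null 2>&1; then
    # parallel keeps NCPUS tasks running and starts the next line of INPUTS as
    # soon as one finishes, so uneven run times do not leave CPUs idle.
    #   {}  is replaced by the current line of INPUTS
    #   {%} is replaced by the job slot number, 1..NCPUS (pbsdsh -n is assumed
    #       to count from 0, hence the "-1")
    parallel -j "$NCPUS" pbsdsh -n '$(({%}-1))' -- \
        bash -l -c "'$SCRIPT {}'" :::: "$INPUTS"
else
    echo "This sketch needs GNU parallel and pbsdsh (run it inside a PBS job)." >&2
fi
```

On a cluster, parallel may first have to be made available, for example through the module system. Lines of INPUTS that contain spaces or shell metacharacters may need extra quoting.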
Many thanks to one of our users, Scott Wales, for sharing this example.
When using Example 4, it is important to choose an appropriate number of CPUs for the number of tasks that need to be run. (The number of tasks in the example equals the number of lines in the input file INPUTS.) As a rough guide, select a number of CPUs about 10 times smaller than the number of tasks. Example 4 uses 256 CPUs, which should be OK for 3000-10000 tasks.
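The rule of thumb can be turned into a small calculation. The helper below is a hypothetical sketch: it divides the number of tasks by 10 and then rounds up to whole nodes, where rounding to whole 16-core nodes is an assumption of this sketch and not part of the guidance itself.

```shell
#!/bin/bash
# Sketch only: suggest_ncpus is a hypothetical helper, not a site-provided tool.
suggest_ncpus() {
    # suggest_ncpus NTASKS [CORES_PER_NODE]
    local ntasks=$1
    local per_node=${2:-16}
    local ncpus=$(( ntasks / 10 ))                      # about 10 tasks per CPU
    local nodes=$(( (ncpus + per_node - 1) / per_node ))  # round up to whole nodes
    if [ "$nodes" -lt 1 ]; then
        nodes=1                                         # request at least one node
    fi
    echo $(( nodes * per_node ))
}

suggest_ncpus 3000      # prints 304 (300 CPUs rounded up to 19 nodes of 16)
```

Inside a job preparation script, the task count could be taken from the input file, for example with wc -l.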
Other useful parallel lines:
1) To run commands in the input.cmd file:
2) To run only one command on a node (assuming 16 cores per node):
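Neither command line is reproduced here. The sketch below illustrates one possible form of each, assuming that "parallel" refers to GNU parallel and that pbsdsh -n takes a zero-based CPU index. SCRIPT, INPUTS, NCPUS, NNODES, CORES_PER_NODE, and the sample contents of input.cmd are hypothetical.

```shell
#!/bin/bash
# Sketch only: SCRIPT, INPUTS and the sample input.cmd contents are hypothetical.
SCRIPT=./process_one.sh
INPUTS=inputs.txt
CORES_PER_NODE=16
NCPUS=${PBS_NCPUS:-32}                  # assumes PBS_NCPUS is set inside the job
NNODES=$(( NCPUS / CORES_PER_NODE ))    # number of whole nodes in the job

# Small input.cmd for illustration: every line is one complete command.
printf '%s\n' 'echo alpha' 'echo beta' 'echo gamma' > input.cmd

if command -v parallel >/dev/null 2>&1 && command -v pbsdsh >/dev/null 2>&1; then
    # 1) No command template is given, so each line of input.cmd is run as a
    #    command on the local node.
    parallel -a input.cmd

    # 2) One task per node: slot k (1..NNODES) maps to pbsdsh index 16*(k-1),
    #    assumed here to be the first CPU of node k.
    parallel -j "$NNODES" pbsdsh -n "\$(( $CORES_PER_NODE*({%}-1) ))" -- \
        bash -l -c "'$SCRIPT {}'" :::: "$INPUTS"
else
    echo "This sketch needs GNU parallel and pbsdsh (run it inside a PBS job)." >&2
fi
```

Running one task per node is mainly useful when each task is itself multi-threaded or needs most of a node's memory.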