
The current default walltime and resource limits for Gadi jobs are summarised below. If a higher limit on core count (PBS_NCPUS) or walltime is needed, please lodge a ticket with the NCI help desk that briefly describes why the exception is requested: for example, which simulation uses which solver in which application that does not support checkpointing, or that a scalability study shows close-to-linear speedup at core counts beyond the current PBS_NCPUS limit. We will see how we can help on a case-by-case basis.

| Queue | Max queued jobs per project | Charge rate per resource*hour ‡ | PBS_NCPUS | Max PBS_MEM/node † | Max PBS_JOBFS/node † | Default walltime limit |
|---|---|---|---|---|---|---|
| normal | 1000 (normal route §), 300 (normal-exec §) | 2 SU | 1-48, or a multiple of 48 | 190 GB | 400 GB | 48 hours for 1-672 cores; 24 hours for 720-1440 cores; 10 hours for 1488-2976 cores; 5 hours for 3024-20736 cores |
| express | 1000 (express route), 50 (express-exec) | 6 SU | 1-48, or a multiple of 48 | 190 GB | 400 GB | 24 hours for 1-480 cores; 5 hours for 528-3168 cores |
| hugemem | 1000 (hugemem route), 50 (hugemem-exec) | 3 SU | 1-48, or a multiple of 48 | 1470 GB | 1400 GB | 48 hours for 1-48 cores; 24 hours for 96 cores; 5 hours for 144 or 192 cores |
| megamem | 300 (megamem route), 50 (megamem-exec) | 5 SU | 1-48, or a multiple of 48 | 2990 GB | 1400 GB | 48 hours for 1-48 cores; 24 hours for 96 cores |
| gpuvolta | 1000 (gpuvolta route), 50 (gpuvolta-exec) | 3 SU | multiple of 12 | 382 GB | 400 GB | 48 hours for 1-96 CPU cores; 24 hours for 144-192 CPU cores; 5 hours for 240-960 CPU cores |
| normalbw | 1000 (normalbw route), 300 (normalbw-exec) | 1.25 SU | 1-28, or a multiple of 28 | 128 GB or 256 GB | 400 GB | 48 hours for 1-336 cores; 24 hours for 364-840 cores; 10 hours for 868-1736 cores; 5 hours for 1764-10080 cores |
| expressbw | 1000 (expressbw route), 50 (expressbw-exec) | 3.75 SU | 1-28, or a multiple of 28 | 128 GB or 256 GB | 400 GB | 24 hours for 1-280 cores; 5 hours for 308-1848 cores |
| normalsl | 1000 (normalsl route), 300 (normalsl-exec) | 1.5 SU | 1-32, or a multiple of 32 | 192 GB | 400 GB | 48 hours for 1-288 cores; 24 hours for 320-608 cores; 10 hours for 640-1984 cores; 5 hours for 2016-3200 cores |
| hugemembw | 500 (hugemembw route), 100 (hugemembw-exec) | 1.25 SU | 7, 14, 21, 28, or a multiple of 28 | 1020 GB | 390 GB | 48 hours for 1-28 cores; 12 hours for 56-140 cores |
| megamembw | 300 (megamembw route), 50 (megamembw-exec) | 1.25 SU | 32 or 64 | 3000 GB | 800 GB | 48 hours for 32 cores; 12 hours for 64 cores |
| copyq | 1000 (copyq route), 50 (copyq-exec) | 2 SU | 1 | 190 GB | 400 GB | 10 hours |
| dgxa100 | 50 (dgxa100 route), 50 (dgxa100-exec) | 4.5 SU | multiple of 16 | 2000 GB | 28 TB | 48 hours for 16-128 cores; 5 hours for 144-256 cores |
| normalsr | 1000 (normalsr route §), 300 (normalsr-exec §) | 2 SU | 1-104, or a multiple of 104 | 500 GB | 400 GB | 48 hours for 1-1040 cores; 24 hours for 1144-2080 cores; 10 hours for 2184-4160 cores; 5 hours for 4264-10400 cores |
| expresssr | 1000 (expresssr route), 50 (expresssr-exec) | 6 SU | 1-104, or a multiple of 104 | 500 GB | 400 GB | 24 hours for 1-1040 cores; 5 hours for 1144-2080 cores |

† To ensure that your jobs can be handled properly if they are terminated for exceeding the memory and/or local disk limit, please request no more than the amount listed in the corresponding column.
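
As a rough aid, the sketch below (Python, with hypothetical names such as `max_job_request`) estimates the largest PBS_MEM and PBS_JOBFS totals a job could request on the normal queue. It assumes that memory and jobfs requests are totals for the whole job and that the per-node caps therefore scale with the number of nodes the job spans; check the per-node figures for your queue in the table above before relying on it.

```python
import math

# Per-node caps for the normal queue, taken from the table above.
NCPUS_PER_NODE = 48
MAX_MEM_PER_NODE_GB = 190
MAX_JOBFS_PER_NODE_GB = 400

def max_job_request(ncpus):
    """Largest PBS_MEM and PBS_JOBFS totals (GB) assumed allowed for a normal-queue job.

    Assumes the per-node limits scale with the number of nodes the job occupies.
    """
    nodes = max(1, math.ceil(ncpus / NCPUS_PER_NODE))
    return nodes * MAX_MEM_PER_NODE_GB, nodes * MAX_JOBFS_PER_NODE_GB

# Example: a 96-core (2-node) job may request up to 380 GB memory and 800 GB jobfs in total.
print(max_job_request(96))   # (380, 800)
```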

‡ The number of `resource` units is calculated as ncpus_request * max[ 1, (ncpus_per_node/mem_per_node) * (mem_request/ncpus_request) ], i.e. the larger of the requested core count and the requested memory expressed as an equivalent number of cores.
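
As an illustration of this formula, the sketch below (Python, hypothetical function names) computes the `resource` count for a job and the resulting charge, reading the header "charge rate per resource*hour" as SU = charge rate × resources × walltime in hours. The example plugs in the normal-queue figures from the table (48 cores and 190 GB per node, 2 SU).

```python
def job_resources(ncpus_request, mem_request_gb,
                  ncpus_per_node=48, mem_per_node_gb=190):
    """Number of `resource` units, as defined in note ‡ above."""
    return ncpus_request * max(
        1.0,
        (ncpus_per_node / mem_per_node_gb) * (mem_request_gb / ncpus_request),
    )

def su_charge(ncpus_request, mem_request_gb, walltime_hours, charge_rate_su=2.0):
    """SU charge = charge rate per resource*hour x resources x walltime in hours."""
    return charge_rate_su * job_resources(ncpus_request, mem_request_gb) * walltime_hours

# Example: 4 cores with 40 GB memory for 10 hours in the normal queue.
# The memory share (40 GB is ~10.1 cores' worth of a 190 GB node) exceeds the
# 4 requested cores, so the job is charged for ~10.1 resources: about 202 SU.
print(round(su_charge(4, 40, 10), 1))   # 202.1
```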

§ The route queue is where jobs wait before they are moved to the corresponding execution queue. Only jobs in the execution queues, whose names end with `-exec`, are considered by PBS for execution on compute and data-mover nodes.
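
The tiered default walltime limits in the table above can also be read as a simple lookup. The sketch below (Python, hypothetical names) encodes only the normal-queue row; other queues would need their own tier boundaries taken from the table.

```python
# (upper core-count bound, default walltime limit in hours) for the normal queue,
# taken from the table above.
NORMAL_WALLTIME_TIERS = [(672, 48), (1440, 24), (2976, 10), (20736, 5)]

def default_walltime_hours(ncpus, tiers=NORMAL_WALLTIME_TIERS):
    """Default walltime limit (hours) for a normal-queue job using `ncpus` cores."""
    for max_cores, hours in tiers:
        if ncpus <= max_cores:
            return hours
    raise ValueError("core count exceeds the current PBS_NCPUS limit; "
                     "request an exception through the NCI help desk")

print(default_walltime_hours(96))    # 48
print(default_walltime_hours(960))   # 24
```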