
Gadi job queues are designed to support mainstream production work (normal queue) and more rapid turnaround (express queue), in the same basic configuration as Raijin. Specialised queues are provided to address workloads which utilise specialised hardware. For the transitional phase one release of Gadi, there will be a limited set of queues active, with the others coming online as the underpinning hardware is transitioned to full access. 

Active Queues

Intel Xeon Cascade Lake

express

  • Express is a high priority queue for testing, debugging or quick turnaround.
  • Resource requests (time and number of CPUs) are more limited than the Normal queue. 
  • Charge rate: 6 SU per resource-hour (walltime), which is 3x the normal queue rate
  • For this queue, SUs are based on the higher of CPU request, or memory request divided by 4GB

normal

  • Normal is the standard production queue on Gadi.
  • Normal jobs allow the full range of resource requests.
  • Charge rate: 2 SU per resource-hour (walltime)
  • For this queue, SUs are based on the higher of CPU request, or memory request divided by 4GB
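The charging rule for the express and normal queues can be illustrated with a short shell function. This is a sketch only: it assumes the charge is rate x max(ncpus, memory in GB / 4) x walltime hours with no rounding, and the function name `su_charge` and the example job are hypothetical rather than part of any NCI tool.

```shell
#!/bin/bash
# Hypothetical helper (not an NCI tool): estimate the SU cost of a job.
# Assumes SU = rate * max(ncpus, mem_GB / 4) * walltime_hours, no rounding.
su_charge() {
    # $1 = charge rate, $2 = ncpus, $3 = memory in GB, $4 = walltime in hours
    awk -v r="$1" -v c="$2" -v m="$3" -v h="$4" 'BEGIN {
        units = m / 4             # memory request expressed in 4GB units
        if (c > units) units = c  # charge on whichever request is higher
        print r * units * h
    }'
}

# 4 CPUs with 32GB of memory for 2 hours: memory dominates (32 / 4 = 8 > 4).
echo "normal:  $(su_charge 2 4 32 2) SU"
echo "express: $(su_charge 6 4 32 2) SU"
```

In this example the job is charged for 8 resource units rather than 4 CPUs, because the memory request is the larger of the two.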

copyq

  • Copyq supports management and movement of data (e.g. copy, rsync).
  • Jobs run on dedicated data mover nodes which have an external network interface. Remote data transfers are supported. (You may need to configure passwordless ssh.)
  • Data mover nodes support file I/O, tar, compression, and other standard file operations required for data management.
  • Use mdss commands in copyq jobs to copy data to/from the massdata system. Recommendation: always use "-lother=mdss" when using mdss commands. This will ensure that a job will only run if the mdss system is operational.
  • Copyq jobs do not support computation. 
  • Charge rate: 2 SU per resource-hour (walltime)
  • For this queue, SUs are based on the higher of CPU request, or memory request divided by 4GB
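A copyq job that archives data to massdata might look like the sketch below. The resource values, the `results/` directory and the `archive/` destination are hypothetical, and `mdss put` is assumed to be the upload subcommand available on the data mover nodes; check the mdss documentation before relying on it.

```shell
#!/bin/bash
#PBS -q copyq
#PBS -l ncpus=1
#PBS -l mem=4GB
#PBS -l walltime=01:00:00
#PBS -l other=mdss

# Run from the directory the job was submitted from.
cd "$PBS_O_WORKDIR" || exit 1

# Bundle a (hypothetical) results directory into one compressed archive.
tar -czf results.tar.gz results/

# Copy the archive to massdata (assumed subcommand and destination path).
mdss put results.tar.gz archive/
```

Because the job requests `other=mdss`, it only starts when the mdss system is operational.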

Pending Queues

Specialised Nodes - GPU

gpuvolta

  • 2x 24 core (Intel Xeon Cascade Lake Platinum 8268, 2.9 GHz) in 160 compute nodes
  • 4 x Nvidia Tesla Volta V100 Accelerators on each node
  • 384 GBytes of RAM on CPU
  • 480 GBytes of SSD local disk
  • Charge rate: xxx SU per CPU-hour (walltime).
  • #PBS -q gpuvolta
  • #PBS -l ngpus=1: the minimum ngpus request is 1.
  • #PBS -l ncpus=12: the minimum ncpus request is 12; ncpus must be a multiple of 12 and equal to 12 x ngpus.
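The relationship between ngpus and ncpus on gpuvolta can be sketched as a small helper that prints a consistent set of directives. The function name `gpuvolta_directives` is hypothetical, and the sketch assumes only the rule stated above (ncpus equals 12 x ngpus, with ngpus at least 1).

```shell
#!/bin/bash
# Hypothetical helper: print gpuvolta resource directives for a GPU count.
# Assumes the stated rule: ncpus = 12 x ngpus, with ngpus >= 1.
gpuvolta_directives() {
    ngpus=$1
    case $ngpus in
        ''|*[!0-9]*|0*)
            # Reject empty, non-numeric, zero or zero-padded values.
            echo "ngpus must be a positive integer" >&2
            return 1
            ;;
    esac
    printf '#PBS -q gpuvolta\n#PBS -l ngpus=%d\n#PBS -l ncpus=%d\n' \
        "$ngpus" "$((12 * ngpus))"
}

# Directives for a two-GPU job.
gpuvolta_directives 2
```

The printed lines can be pasted at the top of a job script.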

Specialised Nodes - Large Memory

hugemem

  • 2x 24 core (Intel Xeon Cascade Lake Platinum 8268, 2.9 GHz) in 50 compute nodes
  • 1.5 TB of Intel Optane DC Persistent Memory
  • 1.6TiB local disk (SSD)
  • Charge rate: 2.0 (TBC) SU per CPU-hour (walltime).

Specialised Nodes - Large Memory (ex-Raijin)

hugemembw

  • 2 x 14 cores (Intel Xeon Broadwell technology, 2.6 GHz) in 10 compute nodes
  • 1 TBytes of RAM
  • 400GB local disk (SSD)
  • Charge rate: 1.25 SU per CPU-hour.
  • Minimum number of ncpus request is 7; must be a multiple of 7.
  • #PBS -q hugemembw

megamembw

  • 3 compute nodes with 4 x 8 cores (Intel Xeon Broadwell technology, 2.1 GHz)
  • 3 TBytes of RAM per node
  • 800GB local disk (SSD)
  • Charge rate: 1.25 SU per CPU-hour (walltime).
  • Minimum number of ncpus request is 32; must be a multiple of 32.
  • Minimum memory request is 1.5TB per node.
  • Job script must declare "#PBS -q megamembw".
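The ncpus rules for the hugemembw and megamembw queues lend themselves to a quick check before submission. The sketch below is illustrative only: `check_ncpus` is a hypothetical function, and the step sizes (7 and 32) are taken from the queue descriptions above.

```shell
#!/bin/bash
# Hypothetical pre-submission check (not an NCI tool): confirm that an ncpus
# request is a positive multiple of the queue's step size.
check_ncpus() {
    queue=$1
    ncpus=$2
    case $queue in
        hugemembw) step=7 ;;
        megamembw) step=32 ;;
        *) echo "unknown queue: $queue" >&2; return 2 ;;
    esac
    if [ "$ncpus" -ge "$step" ] && [ $((ncpus % step)) -eq 0 ]; then
        echo "ok: $queue accepts ncpus=$ncpus"
    else
        echo "invalid: $queue needs ncpus in multiples of $step" >&2
        return 1
    fi
}

check_ncpus hugemembw 14                    # 14 is a multiple of 7
check_ncpus megamembw 48 || echo "rejected" # 48 is not a multiple of 32
```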

Intel Xeon Broadwell (ex-Raijin)

expressbw

  • Expressbw is a high priority queue for testing, debugging or quick turnaround on the Broadwell nodes.
  • Broadwell nodes have 2 x 14-core Intel Xeon E5-2690v4 (Broadwell) 2.6GHz CPUs.
  • Resource requests (time and number of CPUs) are more limited than the normalbw queue. 
  • The limits are small, particularly on walltime and number of CPUs.
  • Charge rate: 3.75 SU per CPU-hour (walltime).

normalbw

  • Normalbw is the standard production queue for the Gadi Broadwell nodes.
  • Broadwell nodes have 2 x 14-core Intel Xeon E5-2690v4 (Broadwell) 2.6GHz CPUs.
  • Normalbw jobs allow the full range of resource requests on the Broadwell nodes.
  • Charge rate: 1.25 SU per CPU-hour (walltime).

For more detailed specifications see the Broadwell Compute Nodes page.
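The Broadwell queues are charged per CPU-hour rather than per resource-hour, so a cost estimate needs only the rate, the CPU count and the walltime. The sketch below assumes SU = rate x ncpus x walltime hours with no rounding; `cpu_hour_charge` is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: SU = rate * ncpus * walltime_hours (no rounding assumed).
cpu_hour_charge() {
    # $1 = charge rate, $2 = ncpus, $3 = walltime in hours
    awk -v r="$1" -v c="$2" -v h="$3" 'BEGIN { print r * c * h }'
}

# One full Broadwell node (2 x 14 cores = 28 CPUs) for 2 hours.
echo "normalbw:  $(cpu_hour_charge 1.25 28 2) SU"
echo "expressbw: $(cpu_hour_charge 3.75 28 2) SU"
```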

Intel Xeon Skylake (ex-Raijin)

normalsl

  • Normalsl is the default queue for the Skylake nodes.
  • Skylake nodes are configured with 2 x 16-core (Intel Xeon Gold 6130, 2.1GHz) Skylake processors, 192 GBytes RAM, and 400 GBytes of SSD local disk in each node.
  • Charge rate: 1.5 SU per CPU-hour (walltime).

For more detailed specifications see the Skylake Compute Nodes page.

