First, request multiple nodes for the ARE JupyterLab session and specify the appropriate storage projects.

Then click the "Advanced options" button, enter "/g/data/dk92/apps/Modules/modulefiles" in the "Module directories" field, and load both the NCI-data-analysis/2022.06 and gadi-jupyterlab/22.06 modules in the "Module" field. In the "Pre-script" field, enter the command "jupyterlab.ini.sh -R" to set up the pre-defined Ray cluster.

Click the "Open JupyterLab" button to open the JupyterLab session as soon as it is highlighted.

In the Jupyter notebook, use the following lines to connect to the pre-defined Ray cluster and print its resource information.

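import ray

# Connect to the Ray cluster that is already running in this session
ray.init(address="auto")
# Print the total resources (CPUs, memory, ...) available to the cluster
print(ray.cluster_resources())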

You should see that the cluster uses 96 CPU cores across two nodes, as expected.
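
As a quick sanity check, the dictionary returned by ray.cluster_resources() can be condensed into a few headline numbers. The sketch below is only illustrative: the helper names summarise_resources and report_cluster are hypothetical, it assumes that Ray reports each node under a key beginning with "node:" and memory in bytes (both of which may differ between Ray versions), and the sample values are made up to resemble a two-node, 96-core session.

```python
def summarise_resources(resources):
    """Condense a ray.cluster_resources()-style dict into headline numbers."""
    # Assumption: each node appears as a "node:<address>" key; newer Ray
    # releases also add an internal head-node marker, which is skipped here.
    node_keys = [
        key for key in resources
        if key.startswith("node:") and not key.startswith("node:__")
    ]
    return {
        "cpus": int(resources.get("CPU", 0)),
        "nodes": len(node_keys),
        # Assumption: memory is reported in bytes.
        "memory_gib": round(resources.get("memory", 0) / 2**30, 3),
    }


def report_cluster():
    """Connect to the running cluster and print its summary."""
    import ray  # imported here so the helper above works without Ray installed

    ray.init(address="auto", ignore_reinit_error=True)
    print(summarise_resources(ray.cluster_resources()))


# Illustrative input only; real values come from ray.cluster_resources().
sample = {
    "CPU": 96.0,
    "memory": 240 * 2**30,
    "node:10.0.0.1": 1.0,
    "node:10.0.0.2": 1.0,
}
summary = summarise_resources(sample)
print(summary)
```

Inside the JupyterLab session, calling report_cluster() would print the same kind of summary for the live cluster.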

Monitoring Ray status

...

The Ray status output is refreshed every 2 seconds:

Every 2.0s: ray status gadi-cpu-clx-114660021448.gadi.nci.org.au: Thu Jul 7 11:26:59 2022
======== Autoscaler status: 2022-07-07 11:27:26.480900 ========
Node status
---------------------------------------------------------------
Healthy:
1 node_ab2c1ef9316a7ee01bae8cd9d087f5e5dfbe3c8c254cf0e8752be0b1c166858d7a953a2050cd004f54239c0014cc9fcf34b640d7cac21de
1 node_b2f22752fc2b329e68649a81ba3c26f2d4f6080fa822dda0121d5514e6d5c4a8797357ebc0a5d0b76a6e264df133f0cba2dc236f696d7c8
Pending:
(no pending nodes)
Recent failures:
(no failures)

Resources
---------------------------------------------------------------
Usage:

96.0/96.0 CPU
0.00/239.062 GiB memory
23.70/106.446 GiB object_store_memory

Demands:
{'CPU': 1.0}: 225+ pending tasks/actors
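
If the utilisation figures are needed in a script rather than read by eye, the lines of the Usage block can be parsed with a regular expression. This is a minimal sketch, assuming the "used/total [unit] name" layout shown in the output above; parse_usage is a hypothetical helper, not part of Ray, and the layout may change between Ray versions.

```python
import re

# Matches lines such as "96.0/96.0 CPU" or "0.00/239.062 GiB memory".
USAGE_LINE = re.compile(r"^\s*([\d.]+)/([\d.]+)\s+(?:(\S+)\s+)?(\S+)\s*$")


def parse_usage(status_text):
    """Return {resource: (used, total, unit)} for each usage line found."""
    usage = {}
    for line in status_text.splitlines():
        match = USAGE_LINE.match(line)
        if match:
            used, total, unit, name = match.groups()
            usage[name] = (float(used), float(total), unit)
    return usage


# Usage lines copied from the status output above.
sample = """\
Usage:

96.0/96.0 CPU
0.00/239.062 GiB memory
23.70/106.446 GiB object_store_memory
"""

usage = parse_usage(sample)
cpu_used, cpu_total, _ = usage["CPU"]
print(f"CPU utilisation: {cpu_used / cpu_total:.0%}")
```

Full CPU utilisation together with a non-empty Demands list, as in the output above, suggests that more tasks are queued than there are cores available to run them.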

...