...
Launching a predefined Ray cluster
Gadi
In your job script, load the gadi_jupyterlab module together with NCI-data-analysis/22.06. Then run jupyter.ini.sh to set up the predefined Ray cluster.
#!/bin/bash
After that, in "script.py", you can connect to the existing predefined Ray cluster by calling ray.init() with the address argument set to "auto".
import ray
ray.init(address="auto")
print(ray.cluster_resources())
...
{'object_store_memory': 114296048025.0, 'CPU': 96.0, 'memory': 256690778727.0, 'node:10.6.48.66': 1.0, 'node:10.6.48.67': 1.0}
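The dictionary printed above can be hard to read at a glance. The following is a minimal sketch, not part of the NCI tooling, that pulls the node addresses and CPU count out of such a dictionary. The helper name summarize_resources is hypothetical, and it assumes each node shows up as a key of the form "node:<ip-address>", as in the sample output; key formats may differ between Ray versions.

```python
def summarize_resources(resources):
    """Return (node_addresses, cpu_count) from a ray.cluster_resources()-style dict."""
    nodes = sorted(
        key.split(":", 1)[1]
        for key in resources
        # Assumption: per-node keys look like "node:<ip-address>".
        # Keys whose suffix starts with "__" are treated as internal markers and skipped.
        if key.startswith("node:") and not key.startswith("node:__")
    )
    cpus = int(resources.get("CPU", 0))
    return nodes, cpus


# Sample dictionary copied from the output shown above.
sample = {
    "object_store_memory": 114296048025.0,
    "CPU": 96.0,
    "memory": 256690778727.0,
    "node:10.6.48.66": 1.0,
    "node:10.6.48.67": 1.0,
}

nodes, cpus = summarize_resources(sample)
print(f"{len(nodes)} nodes, {cpus} CPUs: {nodes}")
```

In a running job you would pass ray.cluster_resources() to the helper instead of the sample dictionary.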
ARE
First, request multiple nodes in the ARE JupyterLab session interface and specify the storage projects.
Then click "Advanced options", enter "/g/data/dk92/apps/Modules/modulefiles" in "Module directories", and load both NCI-data-analysis/22.06 and gadi-jupyterlab/22.06. In the Pre-script field, enter the command "jupyterlab -R".
Wait until the JupyterLab session starts, then click the "Open JupyterLab" button to open it.
In the Jupyter notebook, run the following lines to connect to the predefined Ray cluster and print the resources in use.
You should see that 96 CPU cores and two nodes are in use.
Monitoring Ray status
You can monitor the Ray status with the command "ray status". Open a terminal in either the JupyterLab session or the Gadi PBS job and type:
$ watch ray status
The Ray status is then refreshed every 2 seconds:
...
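If you would rather poll the status from Python, for example to log it from inside a job, the same idea can be sketched with the standard library. This is an illustrative sketch only: the helper name poll_command is hypothetical, it simply re-runs a command at a fixed interval as "watch" does, and it assumes the "ray" executable is on PATH in the session.

```python
import shutil
import subprocess
import sys
import time


def poll_command(cmd, interval=2.0, iterations=3):
    """Run cmd repeatedly, print its output each time, and return the captured outputs."""
    outputs = []
    for i in range(iterations):
        result = subprocess.run(cmd, capture_output=True, text=True)
        outputs.append(result.stdout)
        print(result.stdout, end="")
        if result.returncode != 0:
            # Surface error text, e.g. when no Ray cluster is running.
            print(result.stderr, end="", file=sys.stderr)
        if i < iterations - 1:
            time.sleep(interval)
    return outputs


# Only attempt the call when the ray CLI is available in this environment.
if shutil.which("ray"):
    poll_command(["ray", "status"], interval=2.0, iterations=3)
```

Unlike "watch", this sketch stops after a fixed number of iterations, so it will not keep a notebook cell or job step busy indefinitely.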