There exists a plethora of online material related to RAPIDS. One such example is this blog [https://developer.nvidia.com/blog/accelerating-single-cell-genomic-analysis-using-rapids/], which showcases how to accelerate single-cell genomic analysis using RAPIDS. The authors of this blog released several notebooks to showcase their work, and these notebooks can be run on Gadi with some modifications. For the following example, we will look at the notebook [https://github.com/clara-parabricks/rapids-single-cell-examples/blob/v2021.12.0/notebooks/1M_brain_gpu_analysis_multigpu.ipynb] and show how to modify our RAPIDS environment to run it on Gadi.

...

Most of the packages imported in this notebook are available in rapids/2022.02. There are only two missing packages, scanpy and anndata. Please follow the instructions provided on this page under `Work with Other Python Packages` to learn how to install additional packages.

For this example, we will use pip to install scanpy:

...

where INSTALL_DIR defines where scanpy will be installed.

As scanpy installs anndata automatically as part of its dependencies, you should see both of these packages available in $INSTALL_DIR/lib/python3.9/site-packages after successfully running the command above. 
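As a quick check (a sketch only; $INSTALL_DIR is whatever you passed to the install command), list the prefix to confirm both packages are present:

```shell
# Both a scanpy and an anndata directory should appear in the listing.
ls "$INSTALL_DIR/lib/python3.9/site-packages" | grep -Ei 'scanpy|anndata'
```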

...

Given the scanpy tests take less than 10 minutes to complete, you should be able to run these tests on the login node. For more intensive tests, please run them in a PBS job.
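One way to run the tests, sketched below under the assumption that the test suite comes from a clone of the scanpy repository (the clone URL is scanpy's current home; check out the tag matching your installed version, which `pip show scanpy` reports):

```shell
module use /g/data/dk92/apps/Modules/modulefiles/
module load rapids/22.02
export PYTHONPATH=$INSTALL_DIR/lib/python3.9/site-packages:$PYTHONPATH
# The tests live in the source repository, not in the pip-installed package.
git clone https://github.com/scverse/scanpy.git
cd scanpy
python3 -m pytest scanpy/tests
```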

Info

There are some tests in the scanpy test suite that fail because their RMS values are greater than the expected tolerance. These failed tests are a good demonstration of the risks involved when running applications that are not built from source on Gadi. As the failing tests do not exceed the tolerance limit by a large amount, we will proceed with using this installation for the following example.

...

The example notebook has a short section of code that downloads the input data. On Gadi, the download needs to be run on either the login nodes or the copyq nodes, as Gadi compute nodes have no access to external networks. Given the dataset in this example is small, we will download it on the login node directly:

...
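On the login node this reduces to a single wget call, sketched below (the URL is a placeholder; copy the real link from the notebook's data-download cell, and the file name from the same cell):

```shell
cd $WORKDIR
# Placeholder URL: take the actual link from the notebook's download cell.
wget https://example.com/path/to/1M_brain_cells_10X.sparse.h5ad
```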

The notebook also calls auxiliary functions defined in another file hosted in the same GitHub repository. This file would also need to be downloaded to your working directory: 

...
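A sketch of fetching the auxiliary file with wget; the file name and path below are an assumption based on the v2021.12.0 notebooks directory of the repository, so verify them against the branch you are using:

```shell
cd $WORKDIR
wget https://raw.githubusercontent.com/clara-parabricks/rapids-single-cell-examples/v2021.12.0/notebooks/rapids_scanpy_funcs.py
```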

Note that we use the v2021.12.0 version of the notebook, and this file may be different in other branches.

...

Once the required Python packages, input data and auxiliary functions are all available on Gadi, we can run our example notebook on a GPU node. To gain a deeper understanding of the inner workings of this notebook, it is recommended to run it interactively.

...

Note that this interactive job assumes your default project $PROJECT has enough SU to support a 2-GPU job for half an hour. If this is not the case, please replace $PROJECT with a project code that has sufficient compute resources. For more information on how to look up resource availability in projects, please see the Gadi User Guide.

The interactive job also assumes the directory $INSTALL_DIR is located inside /g/data/${PROJECT} and $WORKDIR inside /scratch/${PROJECT}, where ${PROJECT} defines the project code that supports this job. If this is not the case, please revise the string passed to the PBS -lstorage directive accordingly. More information on PBS directives can be found here.
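For reference, such an interactive request could look like the sketch below. The 2-GPU size and half-hour walltime match the assumptions above; the gpuvolta queue requires 12 CPU cores per GPU, and ab12 is a placeholder project code in the storage string:

```shell
# Placeholder project code ab12; dk92 provides the rapids module.
qsub -I -P $PROJECT -q gpuvolta \
     -l ngpus=2,ncpus=24,mem=180GB,walltime=00:30:00 \
     -l storage=gdata/dk92+gdata/ab12+scratch/ab12 \
     -v INSTALL_DIR=$INSTALL_DIR,WORKDIR=$WORKDIR
```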

Once the job is ready, prepare the environment and initiate the dask cluster inside python3. Please note that when editing your PYTHONPATH, the variable INSTALL_DIR that was passed through the PBS job submission line is used. If this INSTALL_DIR is not accessible from the job, importing the scanpy and anndata packages will fail.

Code Block
gpu-node $ module use /g/data/dk92/apps/Modules/modulefiles/
gpu-node $ module load rapids/22.02
gpu-node $ export PYTHONPATH=$INSTALL_DIR/lib/python3.9/site-packages:$PYTHONPATH
gpu-node $ python3
.
.
.
python3 >>> from dask_cuda import initialize, LocalCUDACluster
python3 >>> from dask.distributed import Client, default_client
...
python3 >>> cluster = LocalCUDACluster()
python3 >>> client = Client(cluster)
python3 >>> client
<Client: 'tcp://127.0.0.1:40259' processes=2 threads=2, memory=180.00 GiB>

The first necessary modification is shown above. When initiating the LocalCUDACluster, no argument is required as long as the workers run on the same compute node. Running LocalCUDACluster() will start a local scheduler ready to connect with the same number of workers as there are GPUs available inside the job. Since the Gadi gpuvolta queue has no more than 4 GPUs per node, this method is only valid for jobs that require no more than 4 GPUs. To learn how to run tasks using more than 4 GPUs across multiple nodes, follow the instructions in Example 2 on this page.
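The one-worker-per-GPU behaviour can be illustrated without any GPUs at all. The sketch below is plain Python and purely conceptual (`n_workers_from_env` is our own helper, not a dask or dask-cuda API): it derives a worker count from CUDA_VISIBLE_DEVICES, the variable a PBS GPU job exposes, which is essentially the information LocalCUDACluster() uses to decide how many workers to start.

```python
import os

def n_workers_from_env(default: int = 1) -> int:
    """Mimic LocalCUDACluster()'s sizing: one worker per visible GPU."""
    devices = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    gpu_ids = [d for d in devices.split(",") if d.strip()]
    return len(gpu_ids) if gpu_ids else default

# Pretend we are inside a 2-GPU job, where two device IDs are exposed:
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
print(n_workers_from_env())  # 2, matching the two GPUs of the job
```

This matches the Client summary above: a 2-GPU allocation comes up as `processes=2`, one dask worker per GPU.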

...