Introduction
We have installed bioinformatics analysis software and workflow tools under the NCI project code hr32. They are located in /g/data/hr32 and can be accessed on gadi or through the ARE interface.
Software components
Software packages are available for both R and python:
- Python. A conda environment including a collection of python packages and some common bioinformatics utilities.
  - More details are available in the Python environment document.
- R. A collection of R packages managed by Bioconductor.
  - More details are available in the R environment document.
Accessing the project
To access these resources, join the hr32 project through mancini. Membership of hr32 only grants access to the software; the project provides no storage or compute resources. Use your own project code for the computation.
Getting started
On gadi
On gadi, four modules are available: NCI-bio-R, NCI-bio-python, HDFS and spark. To include these in your module search path, type:
module use /g/data/hr32/apps/Modules/modulefiles
At present the R and python modules have version 2021.07. To load them:
module load NCI-bio-R/2021.07
module load NCI-bio-python/2021.07
Within a batch queue job on gadi, the option -lstorage=gdata/hr32 needs to be included to access the modules.
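As a rough illustration of the steps above, the following sketch writes a minimal job script that requests the hr32 storage and loads both modules. The file name hr32_job.sh, the project placeholder <your-project>, the queue, and the resource requests are assumptions chosen for the example, not values from this document. Replace them with settings that suit your own project and workload.

```shell
# Write a minimal PBS job script (file name is hypothetical).
cat > hr32_job.sh <<'EOF'
#!/bin/bash
# <your-project> is a placeholder for your own compute project code;
# the queue, ncpus, mem and walltime values are illustrative assumptions.
#PBS -P <your-project>
#PBS -q normal
#PBS -l ncpus=1
#PBS -l mem=4GB
#PBS -l walltime=00:30:00
# Required so that /g/data/hr32 is visible inside the job.
#PBS -l storage=gdata/hr32

# Add the hr32 modulefiles to the search path, then load the modules.
module use /g/data/hr32/apps/Modules/modulefiles
module load NCI-bio-R/2021.07
module load NCI-bio-python/2021.07

# Confirm which modules are loaded; your analysis commands would follow.
module list
EOF

# Submit from a gadi login node with: qsub hr32_job.sh
```

If your job also reads or writes data under your own project, that storage would likely need to be added to the same storage request.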
Running a jupyter notebook session is covered in depth in the existing NCI documentation. In brief, once the modules have been loaded within an interactive batch queue job, jupyter can be started with the command:
jupyter lab --no-browser --ip=$(hostname)
To access this session within a web browser on your local computer, start an ssh tunnel using the following command:
ssh -L <port>:<gadi-compute-node>:<port> -N <username>@gadi.nci.org.au
The details required for this command will be provided in the output printed when the jupyter session starts. For instance jupyter may print a line similar to:
http://gadi-cpu-clx-2014.gadi.nci.org.au:9999/lab?token=10f48aabcd3611ceb706f781210f360276bd1040e2224
The value of <port> in the ssh command will then be 9999 and <gadi-compute-node> will be gadi-cpu-clx-2014.
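The same extraction can be done on your local computer. The sketch below is a minimal example that assumes the URL has the form shown above. It pulls the node name and port out of the URL with shell parameter expansion and prints the matching ssh command. The variable names are illustrative, and <username> is left as a placeholder.

```shell
# URL as printed by jupyter (the example value from the text above).
url='http://gadi-cpu-clx-2014.gadi.nci.org.au:9999/lab?token=10f48aabcd3611ceb706f781210f360276bd1040e2224'

hostport=${url#http://}     # strip the scheme
hostport=${hostport%%/*}    # keep only host:port
port=${hostport##*:}        # text after the last colon
node=${hostport%%.*}        # text before the first dot

# Print the tunnel command rather than running it.
tunnel_cmd="ssh -L ${port}:${node}:${port} -N <username>@gadi.nci.org.au"
echo "$tunnel_cmd"
```

Once the tunnel is running, the session should be reachable in a local browser at localhost on the same port, using the token from the printed URL.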
On ARE
The ARE server runs at are.nci.org.au. You can log in using the same password and username that you use to log onto other NCI services.
When starting a new jupyter session, the resources in hr32 can be loaded through the 'advanced options' by
- adding /g/data/hr32/apps/Modules/modulefiles in the Module directories box;
- adding NCI-bio-python/2021.07 NCI-bio-R/2021.07 in the Modules box.