Within Gadi there are 10 login nodes, 6 data-mover nodes, over 4000 compute nodes, and NCI's massdata tape storage.
Below is a deeper look at these components, expanding on what they are, what they do, and what they connect to.
Navigating through the directories on login nodes is simple if you can remember the rules outlined below.
If you keep these formats in mind as you are using Gadi, you will have no problems navigating to the directories that you need.
Your home directory will always be located at /home/institution/username
Scratch will always be at /scratch/project/username
/g/data follows the same format as scratch, with /g/data/project/username. Note: Not all projects have Global Data storage space.
All software applications are found at /apps/software/version
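For example, with a hypothetical institution name inst, project code ab12, and username abc123, the paths look like this:
$ cd /home/inst/abc123        # your home directory
$ cd /scratch/ab12/abc123     # your scratch directory for project ab12
$ cd /g/data/ab12/abc123      # your /g/data directory, if the project has one
$ ls /apps                    # software applications installed on Gadi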
Owner> User
Accessible from> PBS jobs and login nodes
Size limit> 10 GiB; no extensions available
Allocation valid until> User's account is deactivated
Resource specific attributes:
Owner> Project
Accessible from> PBS Jobs† and login nodes
Size limit> 1 TiB by default with more available on request
Allocation valid until> Project completion or job demand changes
Resource specific attributes:
† Needs to be explicitly mounted using the PBS directive `-lstorage`. Please see our PBS directives listing for more information.
Owner> Project
Accessible from> PBS Jobs† and login nodes
Size limit> Amount is set by the scheme manager
Allocation valid until> Project completion or Scheme closure
Resource specific attributes:
† Needs to be explicitly mounted using the PBS directive `-lstorage`. Please see the job submission page for more information.
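As a minimal sketch, assuming hypothetical project codes ab12 and cd34, a job that needs both filesystems would mount them with a directive like the one below; please check the PBS directives listing for the exact syntax your job needs.
#PBS -l storage=scratch/ab12+gdata/cd34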
Owner> Project
Accessible from> PBS copyq Jobs† and login nodes
Size limit> Amount set by scheme manager
Allocation valid until> Project completion or Scheme closure
Resource specific attributes:
Owner> User
Accessible from> PBS Jobs*
Size limit> SSD disk space available on the job's hosting node(s); 100 MB by default
Allocation valid until> Job termination
Resource specific attributes:
* The job owner can access the folder from the login nodes through commands like `qls` and `qcp` while the job is running.
Owner> User
Accessible from> PBS Jobs
Size limit> All-flash NetApp EF600 storage, volumes available on request
Allocation valid until> Job termination
Resource specific attributes:
Please refer to our I/O Intensive page for more information about this system.
Owner> NCI
Accessible from> PBS jobs and login nodes
Size limit> N/A
Allocation valid until> N/A
Resource specific attributes:
Owner> Software group owner
Accessible from> PBS jobs and login nodes
Size limit> Available seats on the licensing server
Allocation valid until> Licence expiry date
Resource specific attributes:
There is also a quota called iQuota that is applied to /scratch and /g/data. This limits the maximum number of files and folders allowed within a project. You can check your iQuota usage by running the command
$ lquota
Please try to keep the number of files as low as possible, as a high file count can affect the I/O performance of your job. Gadi is efficient at handling large-scale parallel I/O, but performance becomes significantly worse when doing frequent small operations.
A main culprit for creating a large number of files is the Python packaging system conda. Please use pip and the available modules that are already tuned for Gadi to keep the file and folder count as low as possible.
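As a minimal sketch, assuming a python3 module is available and using hypothetical project, username, package, and version names, a pip-based setup on top of the system modules might look like:
$ module load python3/3.9.2                          # hypothetical version; check `module avail python3`
$ python3 -m venv /scratch/ab12/abc123/venvs/myenv   # hypothetical project/username paths
$ source /scratch/ab12/abc123/venvs/myenv/bin/activate
$ pip install mypackage                              # hypothetical package name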
Any job submitted to Gadi is allocated a default 100 MB of storage space on the hosting node's SSD. NCI encourages users to utilise the folder $PBS_JOBFS in jobs that generate a large number of small I/O operations. This will boost your job's performance by reducing the time that would be spent running those small operations, and it frees up space for your project in /scratch and /g/data.
You can also request space on multiple compute nodes by adding the directive `-l jobfs` to your job script. For example,
#PBS -l jobfs=100GB
would request 100 GiB on the nodes. If this job were to run on multiple nodes, the 100 GiB would be distributed equally among all of them. Jobs that request more disk space than is available on the nodes will fail. Please check the queue structure and queue limits pages for information on how much local disk is available.
The limit on $PBS_JOBFS is 400 GiB.
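As a minimal sketch, assuming a hypothetical project code ab12 and a hypothetical program my_program that produces many small temporary files, a job script using $PBS_JOBFS might look like:
#!/bin/bash
#PBS -q normal
#PBS -P ab12
#PBS -l ncpus=1
#PBS -l mem=4GB
#PBS -l walltime=01:00:00
#PBS -l jobfs=10GB
#PBS -l storage=scratch/ab12

# Stage input onto the node-local SSD, work there, then copy results back.
cp /scratch/ab12/$USER/input.dat $PBS_JOBFS/
cd $PBS_JOBFS
/scratch/ab12/$USER/my_program input.dat
cp results.dat /scratch/ab12/$USER/
Remember that everything left in $PBS_JOBFS is removed when the job terminates, so copy out anything you want to keep.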
NCI operates a tape filesystem called massdata to provide a reliable archive for projects to back up their data. The data is held on magnetic tape, stored in machine rooms in two separate buildings. The tapes are accessed and transported by a small robot that works tirelessly for NCI.
While projects do have their own path on massdata, i.e. massdata/<projectcode>, there is no direct access to it via Gadi. Data requests from the tape library must be launched from the login nodes or via a copyq job. You can read our job submission page to learn how to submit copyq jobs.
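As a minimal sketch, assuming a hypothetical project code ab12 and a hypothetical archive file backup/run1.tar already stored on massdata, a copyq job retrieving it might look like:
#!/bin/bash
#PBS -q copyq
#PBS -P ab12
#PBS -l ncpus=1
#PBS -l mem=2GB
#PBS -l walltime=02:00:00
#PBS -l storage=scratch/ab12

# Stage the archive from tape and copy it into the project's scratch space.
mdss -P ab12 get backup/run1.tar /scratch/ab12/$USER/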
NCI provides the `mdss` utility for users to manage the migration and retrieval of files between the levels of a storage hierarchy: from online disk cache to offline tape archive. It connects to massdata and launches the corresponding requests. For example, `mdss get` first launches the requests to stage the remote files from the massdata repository into the disk cache; once the data is online, it then transfers the data back to your local directory, for example, a project folder on /scratch or /g/data.
Below are some simple commands that can help while navigating massdata. These commands can be run from the login nodes and begin with the prefix `mdss`, for example
$ mdss get
You can read the manual for mdss by running the command
$ man mdss
which will show you several more ways to interact with the storage library.
put | copy files to the MDSS
get | copy files from the MDSS
ls | list directories
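As a minimal sketch, with a hypothetical project code ab12 and hypothetical file names, these commands can be combined from a login node like so:
$ mdss -P ab12 put run1.tar     # copy a local file onto tape
$ mdss -P ab12 ls               # list files in the project's massdata space
$ mdss -P ab12 get run1.tar .   # stage the file from tape and copy it back here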