This page provides an overview of what users can expect as they make the transition to Gadi in 2019 Q4. There is a lot of information here, so please take the time to read this page carefully. NCI will update this page and provide more detailed information here as it becomes available.
If you have questions or special concerns about the transition from Raijin to Gadi, please let us know as soon as possible by emailing NCI user support at firstname.lastname@example.org.
Updated 4:00pm AEST
|NOW||NCI data centre preparation and Gadi installation in progress. Users preparing for Gadi.|
|Gadi and Raijin available to users. Jobs can be run (independently) on both systems.|
|Raijin job submission ends.|
|Raijin nodes go offline for decommissioning.|
Raijin /short file system available on Gadi login nodes (alternative path) for user transfers.
|Broadwell, Skylake and P100 and K80 GPU nodes available on Gadi.|
|Gadi resource allocations for 2020 Q1 installed.|
|Gadi operational at full production specification.|
|Raijin /short file system decommissioned.|
Please note that this timeline will be updated as often as necessary to reflect installation activities and dependencies.
- The user's default shell and project will be controlled by the file gadi-login.conf in each user's $HOME/.config directory.
- Gadi /home quotas are applied on a per-user basis, as on Raijin.
- Gadi /home quotas will be 10 GB.
- Gadi login and compute nodes will run the CentOS 8 operating system.
- N login nodes... Same round-robin login service?
- The basic compute charge rate for Gadi is 2.0 SU per core-hour. This charging reflects Gadi's CPU performance relative to Raijin.
- All NCI allocations for 2020, including NCMAS, will be on Gadi only.
- Compute allocations on Gadi are managed by stakeholder scheme managers. See this page (link) to get contact details for your scheme manager.
- Compute allocations on Gadi will apply to a project, as on Raijin.
- In 2019 Q4, all active projects will be given Gadi compute quotas that match (pro rata) their 2019 Q3 or Q4 Raijin allocations.
- During the Raijin-Gadi operating period, compute (job) accounting is independent on each system. A project may consume its full allocation on Raijin, and also its full allocation on Gadi, with no penalty.
File Systems - /home
- Gadi /home is a new, independent file system.
- The quota on Gadi home directories will be 10 GB, as compared to a 2 GB quota on Raijin. Home directories are intended for irreproducible files, e.g. source code and configuration files, and users are expected to utilise /scratch, /g/data and JOBFS file systems for working data.
- The contents of Raijin user home directories will not be migrated to Gadi.
- Raijin /home will be available on Gadi via a temporary, read-only path to help users manage valuable home directory files until Gadi is fully operational. Users are strongly encouraged to copy only essential files from Raijin to their home directories on Gadi.
File Systems - /scratch
- The temporary file system for Gadi users is /scratch. Note that the path '/short', as used on Raijin, will not exist on Gadi.
- The contents of Raijin /short will not be migrated to Gadi /scratch.
- Raijin /short will be available on Gadi via a temporary, read-only path on login and data mover nodes only until . Users are strongly encouraged to copy only essential files to Gadi /scratch.
- Data transfer rates from Raijin /short to Gadi /scratch are expected to be approximately 1 TB per hour. Please plan your transfers accordingly, and do not wait until the last minute.
- Gadi /scratch will be subject to an automated file purging policy: files will be removed 90 days after the time of last modification. In the interest of fairness, exceptions to this policy are not permitted. Any attempts to circumvent the 90-day purge policy by using the touch command or other strategies will result in account deactivation.
- All projects will be provided with a default /g/data directory for storage of persistent data. The default quota for /g/data project directories remains to be finalised. Note that allocations for projects which already have /g/data access will not change.
- Plan to modify your workflow(s) to place temporary files on /scratch, and persistent files on /g/data.
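To help plan around the 90-day purge described above, a small script can list files that are getting close to the limit so they can be moved to /g/data in good time. The following is a minimal sketch only: the function name `files_nearing_purge` and the 75-day warning threshold are illustrative assumptions, not part of any NCI tooling, and the purge itself is governed by NCI's policy rather than by this script.

```python
import os
import time

# From the policy above: files are removed 90 days after last modification.
PURGE_AGE_DAYS = 90


def files_nearing_purge(root, warn_days=75, now=None):
    """Return (path, age_in_days) pairs for files under `root` whose last
    modification time is at least `warn_days` old, oldest first.

    `warn_days` is a hypothetical early-warning threshold chosen for
    illustration; pick any value below PURGE_AGE_DAYS.
    """
    now = time.time() if now is None else now
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            age_days = (now - mtime) / 86400.0
            if age_days >= warn_days:
                stale.append((path, age_days))
    # Oldest files first, since they are closest to removal.
    stale.sort(key=lambda item: item[1], reverse=True)
    return stale
```

Calling `files_nearing_purge("/scratch/<project>/<user>")`, with the placeholders replaced by your own project directory, returns the candidates to copy to /g/data. The script only reads modification times; it does not alter them.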
File Systems - /g/data
- The /g/data file systems will be available on Gadi, and on Raijin during the Gadi pre-production phase.
- /g/data file system performance may
- PBS Pro version 1X...
- Gadi queues...
- Gadi queue limits...
- Resource exemptions (cf. nf_limits) established on Raijin will not be carried across to Gadi. Any job resource exemptions on Gadi will need to be compellingly justified.
- The PBS_JOBFS size on Gadi will be limited to 400 GB per node. Jobs that require more than 400 GB/node are expected to use /scratch.
- Jobs on Gadi must explicitly declare (via PBS '-lother=...' directives) which file systems are to be accessed. As an example, a job which will read or write data in the /g/data1a/project directory must include the directive '-lother=gdata1a'; the job will fail without the directive.
- Job scheduling will be determined at project granularity, as was the case on Raijin. It is not possible to schedule jobs on a per-user basis.
- Jobs will be charged according to the resources requested, that is, by number of CPUs or amount of memory requested, whichever is larger. Note that use of memory was not explicitly charged on Raijin.
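The "whichever is larger" charging rule can be illustrated with a rough estimate. This is a sketch under stated assumptions, not NCI's accounting formula: it assumes requested memory is converted to an equivalent number of CPUs at the node ratio of 192 GB / 48 CPUs = 4 GB per CPU, and that the charge is that CPU count times the charged hours times the basic rate of 2.0 SU per core-hour quoted on this page. The function name and the treatment of walltime are hypothetical.

```python
# Figures taken from this page.
CHARGE_RATE_SU_PER_CORE_HOUR = 2.0
NODE_CPUS = 48
NODE_MEM_GB = 192

# Assumption: memory is weighted at the node's memory-per-CPU ratio (4 GB).
MEM_PER_CPU_GB = NODE_MEM_GB / NODE_CPUS


def estimate_job_charge(ncpus, mem_gb, hours):
    """Estimate the SU charge for a job requesting `ncpus` CPUs and
    `mem_gb` GB of memory for `hours` hours.

    The job is charged for the larger of its CPU request and its memory
    request expressed in CPU equivalents (an assumed conversion).
    """
    mem_equivalent_cpus = mem_gb / MEM_PER_CPU_GB
    charged_cpus = max(ncpus, mem_equivalent_cpus)
    return charged_cpus * hours * CHARGE_RATE_SU_PER_CORE_HOUR


# CPU-bound request: 4 CPUs, 16 GB, 10 hours -> charged for 4 CPUs.
cpu_bound = estimate_job_charge(4, 16, 10)   # 80.0 SU

# Memory-bound request: 4 CPUs, 96 GB, 10 hours -> charged for 24 CPUs.
mem_bound = estimate_job_charge(4, 96, 10)   # 480.0 SU
```

Under these assumptions, a job that requests far more memory than 4 GB per CPU pays for the memory rather than the CPUs, which is the practical difference from Raijin noted above.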
|Raijin||Gadi|
|Intel Xeon E5-2670 (Sandy Bridge)||Intel Xeon Platinum 8274 (Cascade Lake)|
|Two physical processors per node||Two physical processors per node|
|2.6 GHz clock speed||3.1 GHz clock speed|
|16 cores per node||48 cores per node|
|332 GFLOPs per node||4761 GFLOPs per node|
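The per-node GFLOPs figures in the comparison above appear to be consistent with the usual theoretical peak estimate: cores per node × clock speed × double-precision floating-point operations per core per cycle. The FLOPs-per-cycle values below are not stated on this page; they are the commonly quoted figures for these architectures (8 for Sandy Bridge with AVX, 32 for Cascade Lake Platinum parts with two AVX-512 FMA units) and are used here as an assumption to show where the numbers likely come from.

```python
def peak_gflops(cores_per_node, clock_ghz, flops_per_cycle):
    """Theoretical double-precision peak for one node, in GFLOPs."""
    return cores_per_node * clock_ghz * flops_per_cycle


# Sandy Bridge (Raijin): AVX, one 4-wide add plus one 4-wide multiply per cycle.
raijin_peak = peak_gflops(16, 2.6, 8)    # about 332.8, quoted as 332

# Cascade Lake (Gadi): two AVX-512 FMA units, 2 x 8 lanes x 2 operations.
gadi_peak = peak_gflops(48, 3.1, 32)     # about 4761.6, quoted as 4761

per_node_ratio = gadi_peak / raijin_peak
per_core_ratio = (gadi_peak / 48) / (raijin_peak / 16)

print(f"Raijin node peak: {raijin_peak:.1f} GFLOPs")
print(f"Gadi node peak:   {gadi_peak:.1f} GFLOPs")
print(f"Per-node ratio:   {per_node_ratio:.1f}x")
print(f"Per-core ratio:   {per_core_ratio:.1f}x")
```

These are peak figures at nominal clock speed; sustained application performance will generally be lower and depends on how well the code vectorises, which is one reason rebuilding executables for Gadi is recommended.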
- Executables will be binary compatible between Raijin and Gadi. To obtain optimum performance, all executables should be rebuilt for Gadi's processor architecture.
- The most recent versions of widely used third-party software packages will be built by NCI staff and installed in the Gadi /apps directory. The criterion for 'widely used' is more or less continuous usage by three or more independent research groups.
- Unfortunately, it is not possible to build and install all older versions of third-party software. NCI may consider cases of older software if there is a compelling demonstration of need and there are no issues with regard to dependencies or processor architecture.
- NCI can assist with local builds of third-party software for individual research groups on Gadi, as on Raijin. Please note that during the transition to Gadi staff time may be limited and software assistance may be deferred until Gadi is fully operational.
- The modules command will work essentially the same on Gadi as on Raijin.
- Python 2.7.16 will be provided; however, this will be the final version of Python 2 installed on Gadi. All users are encouraged to move to Python 3 as soon as possible.
- Documentation of application software on Gadi is in progress. Watch this space...
- Adjust your timeframes to account for Gadi's faster and more efficient CPUs.
- Gadi nodes have 48 CPUs and 192 GB memory. Single node workflows can now use up to 48 CPUs and 192 GB of memory.
- Containers will be available on Gadi; however, for security and compatibility reasons, NCI staff will need to build the container image. Which containers...?
- NCI's VDI service is independent of Raijin and Gadi. With Raijin, the /apps third-party application tree was copied to the VDI environment. With Gadi, there will initially be no change to the VDI operating environment; however, differences in operating systems and architectures will eventually lead to divergence between the Gadi and VDI application software stacks. Users with questions about VDI software are advised to contact NCI user support.
- NCI's cloud is independent of Gadi. No changes to NCI cloud operations are expected as we bring Gadi into production.
If you have further questions or concerns about the transition from Raijin to Gadi please let us know - contact NCI user support at email@example.com.