NCI systems will be offline from 5:00 am AEST Thursday 12 September until 6:00 pm AEST Friday 13 September to support essential upgrades of critical facility infrastructure for electrical power and cooling. These upgrades are critical for future HPC and data systems with greater performance and capacity.
All NCI services, including Gadi compute, storage systems, network, ARE, Nirin cloud, user support (portal and email), MyNCI, and opus.nci.org.au (documentation), will be unavailable to users on Thursday 12 September and Friday 13 September. Production systems will be powered off during maintenance. Physical access to NCI Building 143 will also be restricted to essential staff on 12-13 September.
NCI expects to restore production services as soon as possible during normal business hours on Friday 13 September as elements of the maintenance work plan are completed. NCI will provide more specific information about the service restoration schedule in further advisory communications.
Please note that 2024 Q4 maintenance remains scheduled for 5 November (8 am - 6 pm AEDT).
Gadi users can do the following to prepare for this scheduled downtime:
Run any significant Gadi jobs needed to support milestones or deadlines as soon as possible.
Back up or move any important data in /scratch directories to appropriate longer-term storage locations, e.g. project directories on /gdata.
Pause any automated scripts used for routine tasks involving Gadi, /scratch, /gdata or cloud or data services, and develop a plan to restart these after completion of maintenance.
Plan for interruption to any services which rely on Lustre file systems: /scratch, /gdata.
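As one way to carry out the backup step above, the following is a minimal Python sketch that copies a directory tree from /scratch to a project directory on /gdata. The function name `backup_tree`, the project code `ab12`, and the directory names are hypothetical placeholders, not NCI-provided tooling; substitute your own paths. It assumes Python 3.8 or later.

```python
import shutil
from pathlib import Path


def backup_tree(src: Path, dst: Path) -> int:
    """Copy src into dst, merging with anything already there.

    Returns the number of regular files found under dst after the copy,
    which gives a quick sanity check that data arrived.
    """
    # dirs_exist_ok=True lets the copy be re-run without failing
    # if dst already exists from an earlier attempt.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sum(1 for p in dst.rglob("*") if p.is_file())


# Example call (hypothetical paths, shown as a comment so nothing runs by accident):
# backup_tree(Path("/scratch/ab12/my_run/output"),
#             Path("/gdata/ab12/backups/my_run_output"))
```

For very large datasets a dedicated transfer tool may be more suitable than a single-process copy; this sketch only illustrates the idea.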
Gadi job queues will be drained in advance of the scheduled downtime. Queued jobs that would normally be scheduled to run during the maintenance period will be held until Gadi resumes normal operation.
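To judge whether a job is likely to complete before the outage, you can compare its start time plus requested walltime against the start of the downtime. The sketch below is plain date arithmetic and does not reproduce how the Gadi scheduler makes its decisions; the function name is hypothetical, and the year 2024 is assumed from the reference to 2024 Q4 maintenance in this notice.

```python
from datetime import datetime, timedelta, timezone

# AEST is UTC+10; daylight saving is not in effect on these dates.
AEST = timezone(timedelta(hours=10))

# Start of the downtime as stated in this notice (year assumed to be 2024).
DOWNTIME_START = datetime(2024, 9, 12, 5, 0, tzinfo=AEST)


def finishes_before_downtime(start: datetime, walltime: timedelta) -> bool:
    """Return True if a job that starts at `start` and uses its full
    requested walltime would end no later than the start of the downtime."""
    return start + walltime <= DOWNTIME_START


# A 24-hour job starting at 6:00 am AEST on 11 September would overrun the cutoff.
print(finishes_before_downtime(datetime(2024, 9, 11, 6, 0, tzinfo=AEST),
                               timedelta(hours=24)))
```

The `start` argument must be timezone-aware; times given in other zones, such as UTC, are compared correctly.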
All ARE services will be shut down by NCI prior to the maintenance downtime at 5:30 am AEST Thursday 12 September. ARE services will be automatically restored at completion of the maintenance downtime.
All virtual machines running on the NCI Nirin Cloud service will be shut down no later than 5:30 am AEST on Thursday 12 September.
If your project VMs have application dependencies, for example a database service that should be shut down before a presentation service, the NCI Cloud team requests that you perform a controlled shutdown of all VM resources before 6:00 pm AEST on Wednesday 11 September. All projects will need to plan to restart all VM services via the Nirin Dashboard (https://cloud.nci.org.au) after completion of the maintenance downtime. Note that automated VM restart using metadata tags is no longer supported for Nirin cloud.
Contact NCI user support if you have specific questions about how to prepare for the scheduled maintenance downtime on 12-13 September: help@nci.org.au.
Support requests received during scheduled maintenance will be addressed as soon as possible after NCI resumes normal operations. NCI recommends that users watch the Gadi message of the day service for status information during the maintenance downtime.