Data Centre

19 Sep 2024 10:10am

Systems Impacted: Compute nodes in normal, express, normalsr, expresssr, and gpuvolta queues

Dear NCI Users,

A cooling fault in the NCI Data Centre has led to the nodes backing the normal, express, normalsr, expresssr, and gpuvolta queues being powered off to protect the hardware. Cooling has now been restored and NCI staff are bringing these nodes back online.

Any jobs that were running on these nodes will have been lost.


Regards,
NCI User Services

/g/data

24 Apr 2024

Systems Impacted: gdata1a

Filesystem Fault

Dear NCI Users,

NCI system admins have reported problems with the gdata1a filesystem. If your session to Gadi is hanging, it is possible that your projects are on that filesystem. System admins are working to find and fix the issue. This page will be updated as soon as there is more information.

Update: The filesystem has been operating normally since 1:19pm.


Regards,
NCI User Services

...

Core cloud infrastructure

4 Jan 2024 9:00am

Systems Impacted: ARE, Nirin VM, accessdev and other services relying on cloud infrastructure

Hardware Faults

Dear NCI Users,

Hardware issues on the core cloud infrastructure have made the impacted systems unresponsive.

We have identified a fix for this issue and we are implementing it now.

If you require further assistance, please contact NCI User Services via the Helpdesk at https://help.nci.org.au or help@nci.org.au.

Update 4 Jan 12:48pm: All the impacted compute nodes are back up. Users should verify that their services are functioning properly. Any users with instances that went down can restart them now.

Regards,
NCI User Services