...

Note that when a project or its users accumulate many files on /scratch, nci-file-expiry can run slowly, or run out of working memory before it prints a report. A project with on the order of 10 users having ~10**5 or more files in the "Warning" state may need to run nci-file-expiry via a PBS job script to allow sufficient CPU time and memory to generate an inventory of scratch files. This potential memory issue will only affect projects that have an extremely large number of scratch files. It is expected to ease after the initial quarantine-expiry cycles in May-June.

It is important to note that users must manage any files they own. A file in quarantine space can be identified and restored to active status only by its owner (userid), regardless of the file's permission settings. It is not possible to implement a role-based (e.g. Lead CI) scratch file management utility at this time. Lead CIs should ask all project members to manage their own files. Use the "ls -al" command to list file ownership and permission information.
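Because only a file's owner can manage it, it can help to see how a directory tree's files are split between owners, and whether any one user is approaching the ~10**5 file scale. The following is a minimal sketch, not an NCI tool: the function names `count_files_by_owner` and `owner_name` and the example path are hypothetical, and it uses only the Python standard library.

```python
import os
import pwd
from collections import Counter


def owner_name(uid):
    """Return the user name for a numeric uid, or the uid as text if unknown."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def count_files_by_owner(root):
    """Walk `root` and count files (including symlinks) per owning user."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat reports the owner of a symlink itself, not its target
                uid = os.lstat(path).st_uid
            except OSError:
                continue  # file vanished or is unreadable; skip it
            counts[owner_name(uid)] += 1
    return counts


# Example use (the path below is a hypothetical project directory):
# for user, n in count_files_by_owner("/scratch/ab12").most_common():
#     print(f"{user:<12} {n}")
```

Walking a very large tree this way can itself take a long time, so for big projects it may be more practical to point it at one subdirectory at a time.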

...

Important points to remember about the new /scratch file management process:

  • All users and project teams are encouraged to be proactive in managing their temporary storage on /scratch. The /scratch file system is intended for temporary, working storage. If you need persistent storage, use the /g/data or massdata systems, or download to your local filesystem.
  • If you have files on /scratch that you do not need, please delete them as soon as possible. You do not need to wait for the automated quarantine-expiry process to run.
  • Users must manage their own files. It is not possible to implement role-based, project access for /scratch file management at this time.
  • All projects with active NCMAS allocations now have /g/data directories. Default allocation is 2.5 GB/KSU.
  • Stakeholder projects will get /g/data allocations per entitlements and demand. For /g/data allocations, please contact your scheme allocation manager. NCI (help@nci.org.au) can help put you in touch with the appropriate scheme managers if needed.
  • Project default scratch quotas will be raised at the time of July quarterly maintenance. (Note that default scratch quotas are still necessary to protect file system stability.)
  • Projects expecting to use large amounts of scratch capacity will still need to request appropriate quotas. Consultation with NCI HPC and Storage groups may be needed for projects intending to use peak-scale scratch capacity.
  • Large scratch requests (e.g. >= 10 TB) from projects with compute allocations of less than 1 MSU/year, or projects without demonstrated track records, will be accommodated in phases with advice from NCI Storage and HPC groups.
  • In general, exceptions to the scratch file expiry policy will not be permitted. If you need advice or assistance to prepare for the full implementation of scratch file quarantine-expiry, contact NCI user support: help@nci.org.au.
  • NCI may make adjustments to quarantine-expiry parameters to ensure operational stability of Gadi and the /scratch file system. If any changes become necessary, they will be communicated in advance to the user community via the Gadi MOTD, the NCI newsletter, and directed email information campaigns.
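
The default /g/data rate quoted in the list above (2.5 GB/KSU) can be turned into a quick estimate. This is a sketch only: the function name `default_gdata_gb` is hypothetical, the compute allocation is assumed to be supplied in KSU, and the actual allocation for any project is whatever NCI or the scheme manager sets.

```python
GB_PER_KSU = 2.5  # default rate stated for projects with active NCMAS allocations


def default_gdata_gb(allocation_ksu):
    """Estimate the default /g/data allocation in GB for a compute allocation in KSU."""
    if allocation_ksu < 0:
        raise ValueError("allocation_ksu must be non-negative")
    return GB_PER_KSU * allocation_ksu


# A project with a 400 KSU allocation: 2.5 GB/KSU * 400 KSU
print(default_gdata_gb(400))  # 1000.0
```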

...