The /scratch file system is now regularly scanned for files that have not been accessed in the last 100 days. Such files are moved into quarantine. If they are not recovered within 14 days of entering quarantine, they are permanently deleted. More details are available at: Gadi /scratch File Management

NCI provides a utility, nci-file-expiry, that:

  • tells you whether any files are going to expire soon;
  • lists files that have been sent to quarantine;
  • provides recovery options for one or many files at a time; and
  • shows the status of files being recovered.

Note that quarantined files still count towards a project's quota, so the lquota and nci_account commands include them, whereas the nci-files-report and du -sh commands do not.

nci-file-expiry supports several actions:

$ nci-file-expiry -h
usage: nci-file-expiry [-h]
                       {list-filesystems,list-quarantined,list-warnings,recover,batch-recover,status}
                       ...

optional arguments:
  -h, --help            show this help message and exit

actions:
  {list-filesystems,list-quarantined,list-warnings,recover,batch-recover,status}
    list-filesystems    list known filesystems
    list-quarantined    list quarantined objects that are still recoverable
    list-warnings       list objects that will soon expire
    recover             request recovery of recently quarantined object
    batch-recover       request recovery of multiple quarantined objects
    status              show status of recovery requests


Further help is available for each of these actions. For example, to see how to recover a quarantined file:

$ nci-file-expiry recover --help
usage: nci-file-expiry recover [-h] UUID PATH

positional arguments:
  UUID        UUID of quarantine record to recover
  PATH        location to store recovered object at (not the directory)
...


That is, you need to supply the UUID of the quarantine record and the complete path, including the filename, where the file should be restored. If the target directory does not exist, you have to create it manually first. Example:

$ nci-file-expiry recover ca7547b7-2066-4c24-9feb-864b23c038de /scratch/xxy/ab1234/dir1/dirTwo/geometry.py
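
Because the target directory has to exist before recovery is requested, it can help to create it in the same step. The following is a minimal shell sketch; ensure_parent_dir is a hypothetical helper name and is not part of nci-file-expiry.

```shell
# Hypothetical helper (not part of nci-file-expiry): create the directory
# that will hold a recovered file, if it does not exist yet.
ensure_parent_dir() {
    target="$1"                      # full path, including the filename
    parent=$(dirname "$target")      # strip the filename
    mkdir -p "$parent"               # no error if it already exists
}
```

For the example above, this could be called as ensure_parent_dir /scratch/xxy/ab1234/dir1/dirTwo/geometry.py before running the recover command.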

If your target path includes special characters (such as spaces or parentheses), escape them with backslashes or enclose the entire path in double quotes.
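
To illustrate the two styles, the sketch below writes the same path both ways and checks that the shell treats them as identical. The directory name "my results (v2)" is made up for this example.

```shell
# The same hypothetical path written two ways.
quoted="/scratch/xxy/ab1234/my results (v2)/geometry.py"
escaped=/scratch/xxy/ab1234/my\ results\ \(v2\)/geometry.py

# Both forms give the shell one identical string.
if [ "$quoted" = "$escaped" ]; then
    echo "same path"
fi
```

Either form can then be given as the PATH argument of the recover action, for example as "$quoted".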

If there are many files to recover, consider using the batch-recover action.

This, however, takes a few steps to set up:

  1. Get the full list of files in quarantine and redirect it to a file:

    $ nci-file-expiry list-quarantined > list_of_files.txt
  2. Edit this file to remove the top header line and the entries for any files you no longer need.

  3. Use a combination of Unix utilities such as awk and grep to reduce each line of the file to the form: UUID target_path_with_filename.
    Example: awk '{print $1,$6}' < list_of_files.txt > final_list_of_files.txt

    The final entries in this file should look something like this:

    395087c4-0555-4521-8fe4-6899796aef93 /scratch/xxy/ab1234/abc/xyz/FDS6.7.7.tar.gz
    397465c4-0532-4957-8af4-6899678eaf39 /scratch/xxy/ab1234/dec/a3f/random.txt
    ...


    Note that if a target directory does not exist, you have to create it manually; nci-file-expiry will not create the path for you. Also, for the batch-recover action, special characters in target paths do not need to be escaped or quoted.

    You may also consider using the --json flag of the list-quarantined action if you would like to filter the list with Python or a similar tool.

  4. When this file is ready, just submit it for processing:

    $ nci-file-expiry batch-recover final_list_of_files.txt

    It scans the file for errors and reports any it finds (these are mostly related to target paths).

  5. Depending on the number of files in the request, it may take some time to recover all of them. You can check the status of the recovery from time to time by running:

    $ nci-file-expiry status
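
Since batch-recover does not create missing directories, it may be worth checking the list before submitting it. The sketch below prints every parent directory named in the list that does not exist yet. list_missing_dirs is a hypothetical helper, not part of nci-file-expiry, and it assumes the format from step 3: the UUID, a space, then the target path.

```shell
# Hypothetical helper (not part of nci-file-expiry).
# Usage: list_missing_dirs final_list_of_files.txt
list_missing_dirs() {
    # The first field is the UUID; "path" receives the rest of the line,
    # so target paths containing spaces stay intact.
    while read -r uuid path || [ -n "$uuid" ]; do
        [ -n "$path" ] || continue             # skip blank or incomplete lines
        parent=$(dirname "$path")
        [ -d "$parent" ] || printf '%s\n' "$parent"
    done < "$1" | sort -u                      # report each directory once
}
```

Any directories it reports can then be created with mkdir -p before the list is passed to batch-recover.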