Recovering files on /home

In terms of backups, /home is the only filesystem on Gadi that provides any sort of failsafe in case of accidental deletion.

Every /home folder has a hidden directory that doesn't show up in a contents listing. This directory, ~/.snapshot, contains various snapshots of your home directory. 

If you've accidentally deleted a file and want to revert to a snapshot taken before this unfortunate event, you can run

$ ls ~/.snapshot

to get a printout similar to the image on the right.

Each of these folders is a snapshot of your entire /home folder at the time stamped in its suffix. From here you can choose the one that most closely matches the time you want to revert to and get your file back. For example, to revert to the topmost hourly backup, you would run

$ cp -r ~/.snapshot/hourly.2023-08-18_0105/lost_folder ~/recovered_folder
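
If you are unsure which snapshots still hold the item, a small loop can check each one before you copy. The following is a minimal sketch: the function name snapshots_with is hypothetical, and it assumes every entry under ~/.snapshot is a snapshot directory, as described above.

```shell
# Hypothetical helper: print every snapshot under SNAP_ROOT that contains REL_PATH.
snapshots_with() {
    snap_root=$1   # e.g. ~/.snapshot
    rel_path=$2    # path relative to your home directory, e.g. lost_folder

    for snap in "$snap_root"/*/; do
        # Report the snapshot only if the file or folder exists inside it.
        [ -e "${snap}${rel_path}" ] && printf '%s\n' "${snap%/}"
    done
    return 0
}
```

Running snapshots_with ~/.snapshot lost_folder prints one snapshot path per line; pick the one whose timestamp suits you and use it as the source of the cp command.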

These snapshots aren't included in your 10 GiB quota on /home, but the recovered file will be, so you need to be aware of how much space the file will take up and how much room is left in your quota.

To check this, you can run the commands below, which give you the size of the folder you are copying and the space currently used in /home. Subtract the latter from your quota to see how much room is left.

$ cd ~/.snapshot/hourly.2023-08-18_0105/
$ du -sh lost_folder
306M    lost_folder
$ du -sh $HOME
3.1G    /home/<111>/<aaa111>
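
The same comparison can be scripted. The sketch below is an illustration only: fits_in_quota is a hypothetical helper, and its default limit is the 10 GiB /home quota mentioned above, expressed in KiB. Note that du reports disk usage, which may differ slightly from how the quota is accounted.

```shell
# Hypothetical helper: succeed if copying SRC into HOME_DIR stays within QUOTA_KIB.
fits_in_quota() {
    src=$1
    home_dir=$2
    quota_kib=${3:-10485760}   # 10 GiB = 10 * 1024 * 1024 KiB

    # du -sk prints "<size in KiB><tab><path>"; keep only the size.
    src_kib=$(du -sk "$src" | awk '{print $1}')
    used_kib=$(du -sk "$home_dir" | awk '{print $1}')
    free_kib=$((quota_kib - used_kib))

    echo "needed: ${src_kib} KiB, free: ${free_kib} KiB"
    [ "$src_kib" -le "$free_kib" ]
}
```

For example, fits_in_quota ~/.snapshot/hourly.2023-08-18_0105/lost_folder "$HOME" && cp -r ~/.snapshot/hourly.2023-08-18_0105/lost_folder ~/recovered_folder copies the folder only when it fits.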

How to recover lost files on /scratch

/scratch isn't intended to be a long-term storage area; it is intended as an area to run your jobs and then shift the data away. As a result, /scratch is scanned regularly for files that haven't been accessed in 100 days.

Any files that aren't accessed in this time period are placed into quarantine. Files in quarantine are permanently deleted after 14 days unless they are recovered.

To help with this, NCI has provided a tool that

  • tells you if there are files going to expire soon.
  • lists files that have been sent to quarantine.
  • provides recovery options for one or many files at a time.
  • shows the status of files being recovered.
  • and more.

To run this tool, use the command 

$ nci-file-expiry -h

Which will return:

From here you can choose to dive even deeper into these options by adding --help to the command line, for example:

$ nci-file-expiry recover --help

This gives a breakdown of the arguments each subcommand needs. In this case, to recover a file we need the universally unique identifier (UUID) of the quarantined file we wish to recover.

If the target directory does not exist, you may have to create it manually and also specify the filename the file should be saved as. Example:

$ nci-file-expiry recover ca7547b7-2066-4c24-9feb-864b23c038de /scratch/xxy/ab1234/dir1/dirTwo/

If your target path includes special characters (such as spaces or round brackets), escape them in the path or enclose the entire path in double quotes.
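
Putting these two points together, a minimal sketch might create the target directory first and always pass the path in double quotes. The helper name recover_into and the NCI_FILE_EXPIRY variable (used only to substitute another command for a dry run) are hypothetical. The recover call has the same form as the example above; check nci-file-expiry recover --help for whether your case needs a full filename instead of a directory.

```shell
# Hypothetical helper: make sure TARGET_DIR exists, then request recovery into it.
recover_into() {
    uuid=$1
    target_dir=$2

    # nci-file-expiry will not create the path for you, so create it first.
    mkdir -p "$target_dir" || return 1

    # Double quotes keep spaces and brackets in the path intact.
    "${NCI_FILE_EXPIRY:-nci-file-expiry}" recover "$uuid" "${target_dir%/}/"
}
```

Setting NCI_FILE_EXPIRY=echo before calling recover_into prints the command that would be run instead of submitting it.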

Batch recovery

In case there are lots of files to be retrieved, consider using the batch-recover option. 

This, however, involves a few steps to set up:

  1. Get the full list of files in quarantine and redirect it to a file:
    $ nci-file-expiry list-quarantined > list_of_files.txt
  2. Edit this file to remove the top header line and the entries for files that you are no longer interested in.

  3. Use Unix utilities such as awk and grep to reduce the file so that each line contains: <UUID> <path/to/quarantined/file>.

    Example:  awk '{print $1,$6}' < list_of_files.txt > final_list_of_files.txt
    Final entries in this file should be something similar to: 

    395087c4-0555-4521-8fe4-6899796aef93 /scratch/xxy/ab1234/abc/xyz/FDS6.7.7.tar.gz
    397465c4-0532-4957-8af4-6899678eaf39 /scratch/xxy/ab1234/dec/a3f/random.txt

    Note that if the target directory does not exist, you have to create it manually, since nci-file-expiry will not create the path for you.

    When recovering lots of quarantined files, it is best to recreate all the directories before submitting the recovery requests. Then you can recover all of the objects without issues.

    The --json flag to list-quarantined generates JSON output with a "type" field, which is "d" for directories and "-" for files. You will need to parse the output to select the directory and file names. For each directory name, simply run

    $ mkdir -p <directory name>

    to create the missing directories.

    For the batch-recover option, target paths with special characters do not need to be escaped.

    The --json output of list-quarantined is convenient for processing in a scripting language like Python.

  4. When this file is ready, just submit it for processing: 

    $ nci-file-expiry batch-recover final_list_of_files.txt
    It will scan the file for errors and let you know if there are any (these will mostly be related to target paths).
  5. Depending on the number of files in the request, it may take some time to recover all of them. You can keep checking the status of the recovery by occasionally running: 

    $ nci-file-expiry status
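
Steps 2 and 3, together with the directory preparation, can be sketched as two small shell functions. This is an illustration under stated assumptions, not an NCI-provided script: the names build_recover_list and make_target_dirs are hypothetical, the saved listing is assumed to have one header line with the UUID in column 1 and the path in column 6 (as in the awk example in step 3), and the second column of the final list is assumed to be the path each file is restored to.

```shell
# Hypothetical helper: turn the saved listing into "<UUID> <path>" lines,
# skipping the header line and keeping only paths that start with PREFIX.
# Paths containing spaces are not handled at this stage, as with the awk example.
build_recover_list() {
    listing=$1
    prefix=$2
    awk -v prefix="$prefix" \
        'NR > 1 && index($6, prefix) == 1 { print $1, $6 }' "$listing"
}

# Hypothetical helper: create the parent directory of every path in the final
# list, because nci-file-expiry will not create missing directories itself.
make_target_dirs() {
    list_file=$1
    # "read -r uuid path" puts everything after the first field into $path.
    while read -r uuid path || [ -n "$uuid" ]; do
        [ -n "$path" ] || continue
        mkdir -p "$(dirname "$path")"
    done < "$list_file"
}
```

A typical sequence would be build_recover_list list_of_files.txt /scratch/xxy/ab1234/abc > final_list_of_files.txt, then make_target_dirs final_list_of_files.txt, followed by the batch-recover command from step 4.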

Authors: Yue Sun, Mohsin Ali, Andrew Johnston, Andrey Bliznyuk, Javed Shaikh