Data-movers

Gadi has six dedicated data-mover nodes that do exactly what the name suggests: move data to and from the system at high speed.

These data-mover nodes have the domain name 'gadi-dm.nci.org.au', as seen below, and you can use this when moving data to and from the system.

Using these data-movers ensures that large data transfers are kept off the login nodes, reducing impact for other users who are accessing them. 

There are several ways to initiate a transfer. Which one you use will depend on your operating system and the amount of data you wish to transfer.

SCP (Secure Copy Protocol) 

scp is a quick and reliable way to transfer small amounts of data from your local machine to Gadi through the data-mover nodes.

rsync

rsync is also available and is the preferred method for transferring large amounts of data. Using rsync means that if your transfer is interrupted in some way, you can simply resume it where it left off. This one feature alone could save you hours of time.

3rd party clients

Clients such as FileZilla, WinSCP, and MobaXterm have the ability to complete transfers simply by dragging and dropping, negating the need to use command lines. 

scp

From local system →  Gadi 

To start an scp transfer through Gadi's data-mover nodes, run the command

$ scp <filename> <user>@gadi-dm.nci.org.au:<destination>

Notice that the domain is no longer gadi.nci.org.au, but gadi-dm.nci.org.au, as you are logging into a data-mover node instead of a login node. 

Replace <filename> with the file you wish to transfer, including the file extension (e.g. .sh, .pdf), and replace <user> with your username. Replace <destination> with the location where you want the file to be placed, for example:

$ scp testfile.sh <user>@gadi-dm.nci.org.au:/home/900/<user>

You will then be prompted for your password. Once you have entered it, your file transfer will begin.

From Gadi → local system

If you want to do the opposite and move a file from Gadi to your local system, reverse the source and destination in the command, like so:

$ scp <user>@gadi-dm.nci.org.au:/home/900/<user>/testfile.sh ./<destination>

To find a listing of all the options available to you with scp, e.g. which option to use to transfer a folder/directory, use the command

$ man scp

to display the manual page, which lists all of the options that you can use.
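As one example of those options, -r tells scp to copy a directory and its contents recursively. The following is a minimal sketch rather than an NCI-provided script: upload_dir and download_dir are hypothetical helper functions, and GADI_USER, the username abc123, and the example paths are placeholders to replace with your own details.

```shell
# Placeholder: set this to your own Gadi username.
GADI_USER="${GADI_USER:-abc123}"
GADI_DM="gadi-dm.nci.org.au"

# Copy a local directory (and everything in it) to a path on Gadi.
upload_dir() {
    scp -r "$1" "${GADI_USER}@${GADI_DM}:$2"
}

# Copy a directory from Gadi into a local directory.
download_dir() {
    scp -r "${GADI_USER}@${GADI_DM}:$1" "$2"
}

# Example usage (paths are illustrative only):
# upload_dir ./results /home/900/abc123
# download_dir /home/900/abc123/results ./
```

Both helpers go through gadi-dm.nci.org.au, so the transfer stays on the data-mover nodes in either direction.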

rsync

If you are transferring a larger file and want to protect against interruptions, rsync gives you the ability to resume a transfer that has been interrupted. As with scp, swapping the source and destination paths changes the direction of the transfer, either uploading from your local device or downloading from Gadi.

rsync also allows greater control over the transfer and exactly what happens with the file by placing options into the command line, such as

$ rsync -vP testfile.sh <user>@gadi-dm.nci.org.au:/home/900/<user>

The -v option makes rsync's output verbose. The -P option is shorthand for --partial --progress: it shows progress during the transfer and keeps partially transferred files so that an interrupted transfer can be resumed. To also preserve file attributes such as timestamps and permissions, add the -a (archive) option.

To find a listing of all the options available to you with rsync, e.g. which option to use to transfer a folder/directory, use the command

$ man rsync

to display the manual page, which lists all of the options that you can use.

If something happens and your transfer doesn't complete, simply run the exact command again and the transfer will pick up where it left off.
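Re-running the command by hand can also be automated. The sketch below is one possible approach, not an NCI-provided tool: retry_transfer is a hypothetical helper that repeats any command until it succeeds or an attempt limit is reached, and RETRY_DELAY is an assumed variable name for the pause between attempts. Resuming a partially transferred file relies on rsync being run with -P (or --partial).

```shell
# Re-run a command until it succeeds or the attempt limit is reached.
# Usage: retry_transfer <max_attempts> <command> [args...]
retry_transfer() {
    max_attempts="$1"
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            echo "Transfer completed on attempt ${attempt}"
            return 0
        fi
        echo "Attempt ${attempt} of ${max_attempts} failed" >&2
        attempt=$((attempt + 1))
        # Pause before the next attempt (defaults to 10 seconds).
        sleep "${RETRY_DELAY:-10}"
    done
    echo "Giving up after ${max_attempts} attempts" >&2
    return 1
}

# Example usage (username and path are placeholders):
# retry_transfer 5 rsync -avP testfile.sh abc123@gadi-dm.nci.org.au:/home/900/abc123
```

Because each attempt is the same rsync command, every retry continues from whatever the previous attempt managed to transfer.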

3rd Party Applications

Third party applications can be used as a way of visualising file transfers instead of using the command line.

FileZilla, WinSCP, MobaXterm, and many more will allow you to drag and drop your files, negating the need to remember and use command lines. These can be a very handy tool to add to your HPC toolbox and we recommend that you become acquainted with them.

Please refer to the specific program's user manual to learn how to utilise its transfer features.

Using wget to Download

If you need a file that is already hosted online, the wget command is a quick and simple way to download this file directly. 

To do this, first locate the URL of the file you need. You can usually do this by right clicking the file and selecting 'Copy link address'. Once you've copied the link, run the command

$ wget <url>

Simply replace <url> with the link that you copied and the system will immediately start downloading your file and place it into your current working directory.
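To save the file into a specific directory, or to continue a download that stopped part-way, wget provides the -P and -c options. The following is a minimal sketch under stated assumptions: download_to is a hypothetical helper, and the URL and target directory in the usage line are placeholders.

```shell
# Download a file into a chosen directory, resuming a partial download if one exists.
download_to() {
    url="$1"
    target_dir="$2"
    # Create the target directory if it does not exist yet.
    mkdir -p "$target_dir"
    # -c continues a partially downloaded file; -P sets the directory to save into.
    wget -c -P "$target_dir" "$url"
}

# Example usage (URL and directory are placeholders):
# download_to "https://example.org/dataset.tar.gz" "$HOME/downloads"
```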

Large transfers via copyq

For large transfers, over 500 GiB for instance, it is better to submit the transfer as a job to the copyq queue. To read about how to do this, follow this link.


Compressed Files

A useful way of minimising transfer time is compressing files so that the overall size of the package is smaller. This is also done when you have a lot of small files that can be packed into one larger file and transferred with less effort. Compressed files can't be used as they are and need to be unpacked first.
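As an illustration of the packing step, the sketch below bundles a directory of small files into a single gzip-compressed archive before transfer. The directory name results, the file names, and the temporary working directory are made up for the example.

```shell
# Work in a scratch directory so the sketch does not touch existing files.
workdir="$(mktemp -d)"
cd "$workdir"

# Create a hypothetical directory of small files to stand in for real data.
mkdir results
for i in 1 2 3; do
    echo "sample ${i}" > "results/part_${i}.txt"
done

# Pack and gzip-compress the directory into a single archive.
tar -czf results.tar.gz results

# List the archive contents without unpacking, as a quick check.
tar -tzf results.tar.gz
```

The single results.tar.gz file can then be transferred with scp or rsync in place of the individual files.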

Depending on what type of file you are trying to access, you will need to unpack them in different ways. Use the list below to work out what command you need to run to unpack your file. 

Extension          Command
.tar               tar -xvf <filename>.tar
.tgz or .tar.gz    tar -xzvf <filename>.tgz
.gz                gunzip <filename>.gz
.zip               unzip <filename>.zip
.bz2               bunzip2 <filename>.bz2
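The choice of command can also be made automatically from the file extension. The function below is a hedged sketch of that idea: extract_archive is a hypothetical helper, not a standard command, and it assumes tar, gunzip, unzip, and bunzip2 are available on the system.

```shell
# Pick an unpacking command based on the archive's file extension.
extract_archive() {
    archive="$1"
    case "$archive" in
        # Check .tar.gz and .tgz before the plain .gz pattern.
        *.tar.gz|*.tgz) tar -xzvf "$archive" ;;
        *.tar)          tar -xvf "$archive" ;;
        *.gz)           gunzip "$archive" ;;
        *.zip)          unzip "$archive" ;;
        *.bz2)          bunzip2 "$archive" ;;
        *)
            echo "Unrecognised extension: $archive" >&2
            return 1
            ;;
    esac
}

# Example usage (file name is a placeholder):
# extract_archive results.tar.gz
```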
Authors: Yue Sun, Andrew Johnston