...

Equivalently, the function `addprocs` can start the same cluster from within a Julia script, as shown below. Inside an ARE Jupyter notebook, the following code adds $PBS_NCPUS workers to the local cluster and collects their process IDs on the master process.

Code Block
start.local.cluster.jl
using Distributed
addprocs(parse(Int64,ENV["PBS_NCPUS"]))
pmap(x->getpid(),1:nworkers())

...

If it is a multi-node job, start the cluster as follows. It uses passwordless SSH login to start Julia worker processes across all nodes available in the job. The following example works through both the command-line interface on a Gadi login node and the graphical user interface in an ARE JupyterLab session.

Code Block
using Distributed

home = ENV["HOME"]
# Read the list of unique node hostnames allocated to this PBS job
nodes = unique(split(read(open(ENV["PBS_NODEFILE"], "r"), String)))
ncpus_per_node = parse(Int64, ENV["PBS_NCI_NCPUS_PER_NODE"])
# Strip the domain suffix and launch one worker per CPU on each node
machines = [(split(node, ".gadi.")[1], ncpus_per_node) for node in nodes]
# Use the same Julia executable on every node
exename = joinpath(ENV["NCI_DATA_ANALYSIS_BASE"], "bin", "julia")
addprocs(machines; tunnel=true, sshflags=`-o PubKeyAuthentication=yes -o StrictHostKeyChecking=no -o IdentityFile=$home/.ssh/juliakey`, exename=exename)
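Once `addprocs` returns, a quick sanity check can confirm that workers are spread across the allocated nodes. The sketch below (an illustration, not part of the original page) asks every worker for its hostname using `gethostname` from Julia's standard `Sockets` module; on a healthy multi-node cluster, each node's hostname should appear once per CPU allocated on it.

```julia
using Distributed, Sockets
# Make gethostname available on every worker process
@everywhere using Sockets

# One hostname per worker; if no workers have been added yet,
# pmap falls back to running on the master process.
hosts = pmap(x -> gethostname(), 1:nworkers())
println(sort(unique(hosts)))
```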

...