R is `GNU S' - A language and environment for statistical computing and graphics. R is similar to the award-winning S system, which was developed at Bell Laboratories by John Chambers et al. It provides a wide variety of statistical and graphical techniques (linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, ...).

R is designed as a true computer language with control-flow constructions for iteration and alternation, and it allows users to add additional functionality by defining new functions. For computationally intensive tasks, C, C++ and Fortran code can be linked and called at run time.

There is an Australian AARNet mirror of the main R web site.

Usage

First you need to decide on the version of the software you want to use. Use

module avail R

to check what versions are available. We normally recommend using the latest version available. For example, to load the 4.0.0 version of R use

module load R/4.0.0

For more details on using modules see our modules help guide.

The following procedure will run R under the PBS queueing system. Assume the usual interactive procedure is to start R and input a file called instructions.R containing the sequence of R commands that you wish to execute.

  • Create a batch job script called Rjob similar to the following example: 

    #!/bin/bash
    #PBS -l wd
    #PBS -q normal
    #PBS -l walltime=00:02:00,mem=250MB,jobfs=500MB
    module load R/4.0.0
    R --vanilla < instructions.R > output
  • Make sure this job script is executable and the walltime and mem limits are correct.
  • Submit the job by issuing the following on the command line

    qsub Rjob
  • This starts R and executes the commands in instructions.R. The output you would normally see on screen in an interactive session is written to the file output.
  • Check the files Rjob.e**** and Rjob.o**** for any errors and to see the time consumed.
  • Note the jobfs request in the script: R writes its temporary files to TMPDIR, so the job needs scratch space.
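
The check of the Rjob.e**** files described above can also be scripted once the job has finished. The following is a minimal sketch, assuming the job was submitted as Rjob. The function name check_r_job is hypothetical, and the search pattern assumes that a failed non-interactive R run writes a line starting with "Error" followed by "Execution halted" to standard error.

```shell
#!/bin/bash
# Sketch: scan the PBS error files of a finished job for R failures.
# check_r_job is a hypothetical helper, not an NCI or PBS tool.
check_r_job() {
    local job="${1:-Rjob}"   # name of the job script given to qsub
    local dir="${2:-.}"      # directory the job was submitted from
    local f rc=0
    for f in "$dir/$job".e*; do
        [ -e "$f" ] || continue   # skip the unexpanded pattern if no files exist
        if grep -q -E '^Error|Execution halted' "$f"; then
            echo "possible R error in $f"
            rc=1
        fi
    done
    return "$rc"
}
```

Running check_r_job Rjob in the submission directory prints one line per suspicious error file and returns a non-zero status if any were found.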

This version of R has been built with the Intel MKL library, which provides the BLAS and LAPACK routines for dense linear algebra. If your algorithm depends heavily on LAPACK routines, you may benefit from running in parallel. An example job script follows:

#!/bin/csh
#PBS -l wd
#PBS -q normal
#PBS -l walltime=00:20:00,mem=4GB,ncpus=2
setenv OMP_NUM_THREADS $PBS_NCPUS
module load R/4.0.0
R --vanilla -f instructions.R > output

To see whether it is worth using multiple CPUs, run some timing tests with 1, 2, 4 and so on, up to no more than 16 CPUs, and compare the walltime used. Your problems need to be fairly large to benefit from parallelism.
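
The timing tests can be submitted in a loop. The following is a rough sketch, assuming the parallel job script is saved as Rjob_parallel (a hypothetical name) and that an -l ncpus value given on the qsub command line overrides the one in the script's #PBS directives, which is the usual PBS behaviour but worth confirming on your system. The function name and the Rscale_ job names are also made up for illustration. By default the function only prints the commands.

```shell
#!/bin/bash
# Sketch: submit the same job script with 1, 2, 4, 8 and 16 CPUs.
# submit_scaling_tests and the Rscale_ job names are hypothetical.
submit_scaling_tests() {
    local script="$1"
    local mode="${2:-dry-run}"   # pass "submit" to really call qsub
    local n
    for n in 1 2 4 8 16; do
        if [ "$mode" = "submit" ]; then
            qsub -N "Rscale_${n}" -l ncpus="${n}" "$script"
        else
            echo "qsub -N Rscale_${n} -l ncpus=${n} $script"
        fi
    done
}

# Dry run: show what would be submitted.
submit_scaling_tests Rjob_parallel
```

Because each job gets its own name, the walltime for each CPU count can be read from the matching Rscale_N.o**** file. The mem request may also need to grow with the CPU count.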

If you wish to add extra packages such as randomForest, you need to load the appropriate Intel modules. We recommend using the same Intel compiler version that was used to build R.

The list of modules that were loaded during the R build is in the /apps/R/version/README.nci file. For example, for R/4.0.0, the file is /apps/R/4.0.0/README.nci. There you can see that intel-compiler/2019.5.281 was used. Therefore this is the version that needs to be loaded, as shown below:

module load R/4.0.0
module load intel-compiler/2019.5.281

R
....
> install.packages("randomForest",repos="https://mirror.aarnet.edu.au/pub/CRAN/")
Warning in install.packages("randomForest") :
  'lib = "/apps/R/4.0.0/lib64/R/library"' is not writeable
Would you like to create a personal library
'~/R/x86_64-unknown-linux-gnu-library/4.0.0'
to install packages into?  (y/n) y


If you wish to install packages in a directory other than the default ~/R/x86_64-unknown-linux-gnu-library/4.0.0, you need to set the environment variable R_LIBS to the new directory. This variable must also be set every time you use R.
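
Setting R_LIBS can be wrapped in a small shell function. The following is a minimal sketch for bash. The function name use_r_library and the example directory are hypothetical, so substitute any directory you can write to. The directory is created first because R ignores library paths that do not exist.

```shell
#!/bin/bash
# Sketch: point R at a personal package directory other than the default.
# use_r_library is a hypothetical helper name.
use_r_library() {
    local dir="$1"
    # R drops library paths that do not exist, so create the directory first.
    mkdir -p "$dir" || return 1
    export R_LIBS="$dir"
}

# Example call (placeholder path):
#   use_r_library "$HOME/R/custom-library"
```

Call the function before starting R, both in interactive sessions and in bash job scripts, so that install.packages and library use the same directory.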


Note that some packages cannot be built with the Intel compilers. The problem usually occurs when a package uses complex variables. In such cases, you need to switch to the GNU compilers by modifying the .R/Makevars file in your home directory. Putting the following lines in this file forces R to use gcc/g++ instead of icc:

CXX=g++
CXX11=g++
CXX14=g++
CC=gcc


Do not forget to comment out these lines (add a # symbol in front of each line) after installing the problematic package.
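
Switching the Makevars overrides on and off can be scripted. The following is a minimal sketch, assuming the file is ~/.R/Makevars unless another path is given. Both function names are hypothetical. The first appends the four override lines, and the second comments them out again without touching any other settings in the file.

```shell
#!/bin/bash
# Sketch: toggle the GNU compiler overrides in a Makevars file.
# Both function names are hypothetical; the default path is ~/.R/Makevars.

use_gnu_compilers() {
    local makevars="${1:-$HOME/.R/Makevars}"
    mkdir -p "$(dirname "$makevars")"
    # Append, so that existing settings in the file are preserved.
    printf '%s\n' 'CXX=g++' 'CXX11=g++' 'CXX14=g++' 'CC=gcc' >> "$makevars"
}

restore_default_compilers() {
    local makevars="${1:-$HOME/.R/Makevars}"
    local tmp="${makevars}.tmp"
    # Put '#' in front of each override line; other lines are left as they are.
    sed -E 's/^(CXX|CXX11|CXX14|CC)=(g\+\+|gcc)$/#&/' "$makevars" > "$tmp" \
        && mv "$tmp" "$makevars"
}
```

A typical sequence would be use_gnu_compilers, then installing the problematic package in R, then restore_default_compilers.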