
Overview

DFTB+, version 23.1 (currently under construction)

How to use 


Please load the appropriate modulefile with the command
$ module load dftbplus/23.1

For information regarding modules, please see our software applications guide.


Two versions of the major binary executables of the package are provided in $DFTBPLUS_ROOT/bin:

  1. Parallel OMP versions (e.g., dftb+, waveplot, modes and other binaries with names having no extension)
  2. Hybrid parallel MPI-OMP versions of some major binaries (dftb+.mpi, modes.mpi, phonons.mpi, setupgeom.mpi and waveplot.mpi)
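The naming convention above can be summarised in a short sketch: the ".mpi" suffix marks the hybrid MPI-OMP build, while names without an extension are the OMP build (the binary names below are illustrative examples from the lists in this guide).

```shell
# Classify a DFTB+ binary by its name, following the convention described
# above: a ".mpi" suffix marks the hybrid MPI-OMP build, anything else is
# the plain OMP build.
classify() {
  case "$1" in
    *.mpi) echo "hybrid MPI-OMP" ;;
    *)     echo "OMP" ;;
  esac
}

classify dftb+
classify dftb+.mpi
```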

The OMP executable dftb+ was built with support for the following optional features:

  • transport (NEGF) calculations;
  • the Poisson solver in non-transport calculations;
  • the extended tight-binding Hamiltonian (xTB, via the tblite library);
  • the PLUMED MD tools (linked to the external PLUMED 2.8.2 library);
  • the Implicitly Restarted Arnoldi Method (linked to the external ARPACK-NG 3.7.0 library);
  • the DFT-D3 dispersion correction (linked to the external SDFTD3 library);
  • repulsive corrections (via the ChIMES library);
  • the MAGMA GPU-enabled eigensolver (linked to the external MAGMA 2.6.2 library).

The MPI-OMP executable dftb+.mpi was built with support for the following optional features:

  • transport (NEGF) calculations;
  • the Poisson solver in non-transport calculations;
  • the extended tight-binding Hamiltonian (xTB, via the tblite library);
  • the PLUMED MD tools (linked to the external PLUMED 2.8.2 library);
  • the DFT-D3 dispersion correction (linked to the external SDFTD3 library);
  • repulsive corrections (via the ChIMES library);
  • eigensolvers from the ELSI library (linked to the external ELSI 2.9.1 library; includes the ELPA, OMM, PEXSI and NTPoly eigensolver methods and GPU support).

The ELSI library does not support OMP parallelism; it is therefore recommended to set the number of OMP threads to 1 (e.g., by adding -x OMP_NUM_THREADS=1 to the mpirun command) whenever any of the ELSI eigensolvers (ELPA, OMM, PEXSI or NTPoly) is requested for a calculation.

Below we give some examples of PBS scripts to run a DFTB+ job in the various available regimes.

Use of the OMP binary dftb+. The example submission script dftb+.pbs below uses a fictitious project a99 and asks for 16 CPUs, 1 hour of walltime, 16 GB of memory and 1 GB of fast jobfs disk space.

#!/bin/bash
 
#PBS -P a99
#PBS -l ncpus=16
#PBS -l mem=16GB
#PBS -l jobfs=1GB
#PBS -l walltime=01:00:00
#PBS -l wd
 
# Load module, always specify version number.
module load dftbplus/23.1
 
# Must include `#PBS -l storage=scratch/ab12+gdata/yz98` if the job
# needs access to `/scratch/ab12/` and `/g/data/yz98/`
 
export OMP_NUM_THREADS=$PBS_NCPUS
dftb+ > output

The input file dftb_in.hsd must be located in the directory from which the job is submitted. To submit the job to the queuing system:

$ qsub dftb+.pbs
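Since the job reads dftb_in.hsd from the submission directory, a quick pre-flight check can avoid a wasted queue slot. The helper below is a hypothetical convenience, not part of the package or the scheduler:

```shell
#!/bin/bash
# Hypothetical pre-flight check: confirm dftb_in.hsd exists in the given
# directory before submitting with qsub (PBS itself does not verify this).
check_input() {
  if [ -f "$1/dftb_in.hsd" ]; then
    echo "ok"
  else
    echo "missing dftb_in.hsd"
  fi
}

# Demonstration with a throwaway directory containing the input file:
dir=$(mktemp -d)
touch "$dir/dftb_in.hsd"
check_input "$dir"
```

In a real workflow one would run the check in the submission directory and only call `qsub dftb+.pbs` when it reports "ok".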

Use of the MPI-OMP binary dftb+.mpi for pure MPI calculations. This regime is recommended if the ELPA, OMM, PEXSI or NTPoly eigensolvers from the ELSI library are requested in the input. The example submission script dftb+.pbs below uses a fictitious project a99 and asks for 96 CPUs, 1 hour of walltime, 180 GB of memory and 10 GB of jobfs disk space.

#!/bin/bash
 
#PBS -P a99
#PBS -l ncpus=96
#PBS -l mem=180GB
#PBS -l jobfs=10GB
#PBS -l walltime=01:00:00
#PBS -l wd
 
# Load module, always specify version number.
module load dftbplus/23.1
 
# Must include `#PBS -l storage=scratch/ab12+gdata/yz98` if the job
# needs access to `/scratch/ab12/` and `/g/data/yz98/`
 
mpirun -np $PBS_NCPUS -x OMP_NUM_THREADS=1 dftb+.mpi > output

Use of the MPI-OMP binary dftb+.mpi for hybrid MPI-OMP calculations. This regime is recommended wherever possible for multi-node calculations. The mpirun command is more complicated here than in the pure MPI case above, as extra options must be provided on the mpirun command line to define the partitioning of the allocated CPUs between MPI ranks and the number of OpenMP threads within each MPI rank. The example submission script dftb+.pbs below uses a fictitious project a99 and asks for 96 CPUs, 1 hour of walltime, 180 GB of memory and 10 GB of jobfs disk space. One MPI rank per NUMA node is requested, with OMP parallelism utilized within each NUMA node.

#!/bin/bash
 
#PBS -P a99
#PBS -l ncpus=96
#PBS -l mem=180GB
#PBS -l jobfs=10GB
#PBS -l walltime=01:00:00
#PBS -l wd
 
# Must include `#PBS -l storage=scratch/ab12+gdata/yz98` if the job
# needs access to `/scratch/ab12/` and `/g/data/yz98/`
 
module load dftbplus/23.1
RANKS=$((PBS_NNODES*PBS_NCI_NUMA_PER_NODE))
THREADS=$PBS_NCI_NCPUS_PER_NUMA
MPI_OPTIONS="-map-by ppr:1:numa:PE=$THREADS --bind-to core -x OMP_NUM_THREADS=$THREADS -report-bindings"
RUN_EXE="dftb+.mpi"
RUN_CMD="mpirun -np $RANKS ${MPI_OPTIONS} ${RUN_EXE}"
 
echo "Job started on ${HOSTNAME}"
echo "The MPI command is: ${RUN_CMD}"
 
${RUN_CMD}

In the above example of the submission script, we illustrate allocation of one NUMA node per MPI rank. In the normal PBS queue on Gadi, each 48-CPU node has 4 NUMA nodes with 12 CPUs per NUMA node. Within each NUMA node, the job will use OpenMP parallelism. For the normal queue, the script above therefore gives 8 MPI ranks with 12 OMP threads within each rank. It is up to the user to decide how many MPI ranks and OpenMP threads to use in a particular job; some preliminary testing is advised to find the MPI-OMP setup giving the best performance.
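The rank/thread arithmetic the script performs can be checked standalone. In a real job the PBS_* variables are set by the scheduler; here they are filled in by hand to mimic a 96-CPU job in the normal queue:

```shell
# Mimic the PBS-provided variables for a 96-CPU job in the normal queue
# (48 CPUs per node: 2 nodes, 4 NUMA nodes per node, 12 CPUs per NUMA node).
PBS_NNODES=2
PBS_NCI_NUMA_PER_NODE=4
PBS_NCI_NCPUS_PER_NUMA=12

# One MPI rank per NUMA node, OMP threads filling each NUMA node:
RANKS=$((PBS_NNODES * PBS_NCI_NUMA_PER_NODE))   # 2 * 4 = 8 MPI ranks
THREADS=$PBS_NCI_NCPUS_PER_NUMA                 # 12 OMP threads per rank

echo "ranks=$RANKS threads=$THREADS total_cpus=$((RANKS * THREADS))"
```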

The input file must be named dftb_in.hsd and must provide the path to the Slater-Koster parameter sets. A large number of Slater-Koster parameter sets is available in the directory /apps/dftbplus/slako/. Also, if OMP threads are used in DFTB+ MPI-OMP calculations, the DFTB+ input file must include the Parallel section with UseOmpThreads set to Yes (the program default is UseOmpThreads = No).
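A minimal sketch of the relevant parts of dftb_in.hsd is given below. The geometry file name and the choice of parameter set under /apps/dftbplus/slako/ are illustrative assumptions, not prescriptions; consult the DFTB+ manual for the full input specification.

```
Geometry = GenFormat {
  <<< "geometry.gen"    # illustrative geometry file name
}

Hamiltonian = DFTB {
  SlaterKosterFiles = Type2FileNames {
    Prefix = "/apps/dftbplus/slako/"   # append the chosen parameter set's subdirectory
    Separator = "-"
    Suffix = ".skf"
  }
}

# Required when OMP threads are used with the MPI-OMP binary dftb+.mpi:
Parallel {
  UseOmpThreads = Yes
}
```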

The user's manual for DFTB+ (manual.pdf) and for the ELSI eigensolver library (elsi_manual_v2.9.1.pdf) are provided in the directory $DFTBPLUS_ROOT. A large set of documentation, including the manual, recipes and tutorials, is available online on the developers' website.

Authors: Ivan Rostov