Overview

DFTB+ is a fast and efficient stand-alone implementation of the Density Functional based Tight Binding (DFTB) method.

It was developed in Paderborn in the group of Professor Frauenheim and is the successor of the old DFTB and Dylax codes.

For more information, see the official DFTB+ site.

Users of the program are advised to sign up to the DFTB-Plus-User mailing list. You may also find answered questions from other DFTB+ users in the mailing list archive.

How to use


To use DFTB+ V21.1, please load the appropriate dftbplus modulefile with the command

$ module load dftbplus/21.1

For more details on using modules see our software applications guide.

The three major executables of the package are dftb+, waveplot, and modes. There are also some auxiliary binaries; see the contents of $DFT_BASE/bin. Most of the binaries are provided in two versions: an OMP version (files with no extension) and a hybrid MPI-OMP version (files with a .mpi extension). The hybrid MPI-OMP mode is preferred for large multi-node calculations on Gadi.

The dftb+ binary also has a third version (dftb+.mpi-elsi), an MPI version linked to the ELSI 2.7.1 library, which provides support for additional eigensolvers (ELPA, OMM, PEXSI and NTPoly). Since ELSI does not support OMP, dftb+.mpi-elsi is a pure MPI binary; therefore OMP_NUM_THREADS=1 must be used in jobs running dftb+.mpi-elsi. All three versions of the dftb+ binary were built with support for transport calculations. The OMP version of the executable (binary dftb+) supports GPU calculation via the MAGMA eigensolver.
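When running dftb+.mpi-elsi, the eigensolver is selected inside dftb_in.hsd. Below is a minimal sketch of the solver block, assuming the ELSI solver names listed in the DFTB+ manual; consult the manual for the options each solver accepts before use.

```
Hamiltonian = DFTB {
  # ... other Hamiltonian settings ...
  Solver = ELPA {}    # or OMM {}, PEXSI {}, NTPoly {}
}
```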

To make the binaries easier to use for beginners, we offer an auxiliary script, run.sh, which decides which version of the binary (OMP or MPI-OMP) to run and sets all OMP and MPI environment settings based on the number of CPUs requested through PBS. However, some parallel options must be provided via the input file; it is left to the user to supply these options when necessary.

If you find that the MPI settings for the job (shown in the first lines of the job log) are not what you want, you can make your own settings by calling the mpirun command directly, i.e. without using the run.sh script.

The script input arguments are:

  • %1 is the binary name; the default is dftb+.
  • %2 is the number of MPI ranks; the default is the number of allocated nodes.
  • %3 is the number of OMP threads per MPI rank; the default is the lesser of the number of cores per node and the number of CPUs requested through PBS.

In most cases, the first argument (the binary name) is enough; the script will set the number of MPI ranks and OMP threads for you based on the available PBS resources.
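For illustration, the final line of a PBS script could call run.sh instead of invoking the binary directly. This sketch assumes run.sh is on your PATH after loading the dftbplus module; the explicit rank and thread counts in the second form are only examples of overriding the defaults.

```
# Let run.sh pick the binary version and set MPI/OMP settings automatically:
run.sh dftb+ > output

# Or override the defaults: 4 MPI ranks, 12 OMP threads per rank:
run.sh dftb+ 4 12 > output
```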

Here is an example of a parallel DFTB+ job run under PBS. The example file dftb+.pbs, using project a99, asks for 16 CPUs, 1 hour of walltime, 16 GiB of memory and 1 GiB of fast jobfs disk space.

#!/bin/bash
 
#PBS -P a99
#PBS -l ncpus=16
#PBS -l mem=16GB
#PBS -l jobfs=1GB
#PBS -l walltime=01:00:00
#PBS -l wd
 
# Load module, always specify version number.
module load dftbplus/21.1
 
# Must include `#PBS -l storage=scratch/ab12+gdata/yz98` if the job
# needs access to `/scratch/ab12/` and `/g/data/yz98/`
 
dftb+ > output

The input file dftb_in.hsd must be located in the directory from which the job is submitted. To submit the job to the queuing system:

$ qsub dftb+.pbs

The dftb+.mpi-elsi binary (the DFTB+ executable with the ELSI eigensolvers, MPI-enabled) can be employed via the standard mpirun command, e.g.

#!/bin/bash
 
#PBS -P a99
#PBS -l ncpus=16
#PBS -l mem=16GB
#PBS -l jobfs=1GB
#PBS -l walltime=01:00:00
#PBS -l wd
 
# Load module, always specify version number.
module load dftbplus/21.1
 
# Must include `#PBS -l storage=scratch/ab12+gdata/yz98` if the job
# needs access to `/scratch/ab12/` and `/g/data/yz98/`

export OMP_NUM_THREADS=1   # dftb+.mpi-elsi is pure MPI; ELSI does not support OMP
mpirun -np $PBS_NCPUS dftb+.mpi-elsi > output

The dftb+.mpi binary (the DFTB+ executable supporting hybrid MPI-OpenMP parallelism, SDFTD3 and PLUMED) can be employed via an mpirun command that provides the environment settings needed to partition the allocated CPUs between MPI ranks and to set the number of OpenMP threads within each MPI rank. See the example below.

#!/bin/bash
 
#PBS -P a99
#PBS -l ncpus=96
#PBS -l mem=92GB
#PBS -l jobfs=1GB
#PBS -l walltime=01:00:00
#PBS -l wd
 
# Must include `#PBS -l storage=scratch/ab12+gdata/yz98` if the job
# needs access to `/scratch/ab12/` and `/g/data/yz98/`

module load dftbplus/21.1
RANKS=$((PBS_NNODES*PBS_NCI_NUMA_PER_NODE))
NTHREADS=$((PBS_NCPUS/RANKS))
MPI_OPTIONS="-map-by node:SPAN,PE=$NTHREADS --bind-to core -x OMP_NUM_THREADS=$NTHREADS -report-bindings"
RUN_EXE="dftb+.mpi"
RUN_CMD="mpirun -np $RANKS ${MPI_OPTIONS} ${RUN_EXE}"
 
echo "Job started on ${HOSTNAME}"
echo "The MPI command is: ${RUN_CMD}"
 
${RUN_CMD}

In the above example, we illustrate the allocation of one NUMA node per MPI rank. In the normal queue on Gadi, each node has 48 CPUs organised as 4 NUMA nodes with 12 CPUs per NUMA node.

Within each NUMA node, the job uses OpenMP parallelism. For the normal queue on Gadi, the example above ends up with RANKS=8 and OMP_NUM_THREADS=12. It is, however, up to the user to decide how many MPI ranks and OpenMP threads to use in a particular job; some preliminary testing may be advised to find the MPI-OMP setup giving the best performance.
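As a check of the arithmetic above, the same computation can be run stand-alone with the PBS variables set by hand; the values below mimic a 96-CPU (2-node) request in the normal queue.

```shell
#!/bin/bash
# Mimic the PBS environment for a 96-CPU (2-node) job in the normal queue:
PBS_NNODES=2             # nodes allocated
PBS_NCI_NUMA_PER_NODE=4  # NUMA nodes per Gadi node
PBS_NCPUS=96             # total CPUs requested

RANKS=$((PBS_NNODES*PBS_NCI_NUMA_PER_NODE))  # one MPI rank per NUMA node
NTHREADS=$((PBS_NCPUS/RANKS))                # OMP threads within each rank

echo "RANKS=$RANKS NTHREADS=$NTHREADS"       # prints RANKS=8 NTHREADS=12
```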

The input file must be named dftb_in.hsd. It is required to provide the path to the Slater-Koster parameter sets; a large number of Slater-Koster parameter sets is available in the directory /apps/dftbplus/slako/.
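A minimal dftb_in.hsd sketch showing how the Slater-Koster path is typically supplied, following the Type2FileNames convention from the DFTB+ manual. The mio subdirectory, geometry file name, and elements are assumptions for illustration; adjust them to your system.

```
Geometry = GenFormat {
  <<< "geo.gen"             # geometry in GEN format (hypothetical file name)
}
Hamiltonian = DFTB {
  SlaterKosterFiles = Type2FileNames {
    Prefix = "/apps/dftbplus/slako/mio/mio-1-1/"  # assumed set under /apps/dftbplus/slako/
    Separator = "-"
    Suffix = ".skf"
  }
  MaxAngularMomentum {
    O = "p"                 # example elements; must match your geometry
    H = "s"
  }
}
```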

A sample input file, PBS submission script, and output files for the DFTB+ and Waveplot programs are available in the directory /apps/dftbplus/21.1/first-calc. Read the file read_me inside that directory for the protocol of running DFTB+ and Waveplot on NCI machines.

A number of Python utilities for processing DFTB+ results are provided; see the contents of /apps/dftbplus/21.1/bin. These utilities require python/3.7.4 to be loaded first.

The user's manual, manual.pdf, for DFTB+ and the included utility programs can be found in the directory $DFTBPLUS_ROOT/bin. A large set of documentation, including the manual, recipes and tutorials, is available online on the developers' website.

Authors: Ivan Rostov, Mohsin Ali