DFTB+ is a fast and efficient stand-alone implementation of the Density Functional based Tight Binding (DFTB) method.
It was developed in Paderborn in the group of Professor Frauenheim and is the successor of the old DFTB and Dylax codes.
For more information, see the official DFTB+ site.
Users of the program are advised to sign up to the DFTB-Plus-User mailing list. Answers to questions from other DFTB+ users can be found in the mailing list archive.
Please load the appropriate modulefile with the command
$ module load dftbplus/22.2
For more details on using modules see our software applications guide.
Two versions of the major binary executables of the package are provided in $DFTBPLUS_ROOT/bin:
the OMP version (dftb+, waveplot, modes and other binaries with names having no extension)
the MPI-OMP version (dftb+.mpi, waveplot.mpi, modes.mpi and other binaries with the .mpi extension)
The OMP executable dftb+ was built with support for the following optional features:
The MPI-OMP executable dftb+.mpi was built with support for the following optional features:
The ELSI library does not support OMP parallelism. It is therefore recommended to set the number of OMP threads to 1 (e.g., by adding -x OMP_NUM_THREADS=1 to the mpirun command) if any of the ELSI eigensolvers (ELPA, OMM, PEXSI or NTPoly) is requested for the calculation.
Below we give some examples of PBS scripts to run a DFTB+ job using various available regimes.
Use of OMP binary dftb+. The example submission script file dftb+.pbs
below uses project a99, asks for 16 CPUs, 1 hour of walltime, 16 GiB of memory and 1 GiB of fast jobfs disk space.
#!/bin/bash
#PBS -P a99
#PBS -l ncpus=16
#PBS -l mem=16GB
#PBS -l jobfs=1GB
#PBS -l walltime=01:00:00
#PBS -l wd

# Load module, always specify version number.
module load dftbplus/22.2

# Must include `#PBS -l storage=scratch/ab12+gdata/yz98` if the job
# needs access to `/scratch/ab12/` and `/g/data/yz98/`

export OMP_NUM_THREADS=$PBS_NCPUS
dftb+ > output
The input file dftb_in.hsd must be located in the directory from which the job is submitted. To submit the job to the queuing system:
$ qsub dftb+.pbs
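Because DFTB+ looks for dftb_in.hsd in the working directory, it can help to check for the file before submitting. The following is only a sketch: the function name have_dftb_input is hypothetical and not part of DFTB+ or PBS.

```shell
#!/bin/bash
# Hypothetical pre-submission helper (not part of DFTB+ or PBS):
# succeeds only if dftb_in.hsd exists in the given directory
# (defaults to the current directory).
have_dftb_input() {
    local dir=${1:-.}
    [ -f "$dir/dftb_in.hsd" ]
}

if have_dftb_input "$PWD"; then
    echo "dftb_in.hsd found, ready to run: qsub dftb+.pbs"
else
    echo "dftb_in.hsd is missing in $PWD" >&2
fi
```

The check is deliberately kept separate from the qsub call so that it can be reused in other wrapper scripts.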
Use of MPI-OMP binary dftb+.mpi for pure MPI calculations.
This regime is recommended if the ELPA, OMM, PEXSI or NTPoly eigensolvers from the ELSI library are requested in the input. The example submission script file dftb+.pbs
below uses project a99, asks for 96 CPUs, 1 hour of walltime, 180 GiB of memory and 10 GiB of the jobfs disk space.
#!/bin/bash
#PBS -P a99
#PBS -l ncpus=96
#PBS -l mem=180GB
#PBS -l jobfs=10GB
#PBS -l walltime=01:00:00
#PBS -l wd

# Load module, always specify version number.
module load dftbplus/22.2

# Must include `#PBS -l storage=scratch/ab12+gdata/yz98` if the job
# needs access to `/scratch/ab12/` and `/g/data/yz98/`

mpirun -np $PBS_NCPUS -x OMP_NUM_THREADS=1 dftb+.mpi > output
Use of MPI-OMP binary dftb+.mpi for hybrid MPI-OMP calculations.
This regime is recommended wherever possible for multi-node calculations. The mpirun command is more complicated than in the pure MPI case above, because extra options must be provided on the mpirun command line to define how the allocated CPUs are partitioned between MPI ranks and how many OpenMP threads run within each MPI rank. The example submission script file dftb+.pbs
below uses project a99, asks for 96 CPUs, 1 hour of walltime, 180 GiB of memory and 10 GiB of the jobfs disk space. One MPI rank per NUMA node is requested with the OMP parallelism utilized within each NUMA node.
#!/bin/bash
#PBS -P a99
#PBS -l ncpus=96
#PBS -l mem=180GB
#PBS -l jobfs=10GB
#PBS -l walltime=01:00:00
#PBS -l wd

# Must include `#PBS -l storage=scratch/ab12+gdata/yz98` if the job
# needs access to `/scratch/ab12/` and `/g/data/yz98/`

# Load module, always specify version number.
module load dftbplus/22.2

# One MPI rank per NUMA node, one OpenMP thread per CPU of that NUMA node.
RANKS=$((PBS_NNODES*PBS_NCI_NUMA_PER_NODE))
THREADS=$PBS_NCI_NCPUS_PER_NUMA

MPI_OPTIONS="-map-by ppr:1:numa:PE=$THREADS --bind-to core -x OMP_NUM_THREADS=$THREADS -report-bindings"
RUN_EXE="dftb+.mpi"
RUN_CMD="mpirun -np $RANKS ${MPI_OPTIONS} ${RUN_EXE}"

echo "Job started on ${HOSTNAME}"
echo "The MPI command is: ${RUN_CMD}"

${RUN_CMD}
The submission script above allocates one NUMA node per MPI rank. In the normal PBS queue on Gadi, each 48-CPU node has 4 NUMA nodes with 12 CPUs per NUMA node. Within each NUMA node, the job uses OpenMP parallelism.
For the normal queue on Gadi, the script above gives 8 MPI ranks with 12 OMP threads within each MPI rank. It is up to the user to decide how many MPI ranks and OpenMP threads to use in a particular job. Some preliminary testing is advised to find the MPI-OMP setup that gives the best performance.
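The rank and thread counts quoted for the normal queue follow from simple arithmetic, which the sketch below reproduces. The helper name mpi_omp_layout and its three arguments (total CPUs, CPUs per node, NUMA nodes per node) are hypothetical and serve only as an illustration. Inside a real job the PBS-provided variables would normally be used instead.

```shell
#!/bin/bash
# Hypothetical helper (not part of DFTB+ or PBS): prints the number of MPI
# ranks and the number of OpenMP threads per rank for a layout with one
# MPI rank per NUMA node.
mpi_omp_layout() {
    local ncpus=$1 cpus_per_node=$2 numa_per_node=$3
    local nodes=$(( ncpus / cpus_per_node ))             # whole nodes in the request
    local ranks=$(( nodes * numa_per_node ))             # one rank per NUMA node
    local threads=$(( cpus_per_node / numa_per_node ))   # CPUs inside one NUMA node
    echo "$ranks $threads"
}

# 96 CPUs on 48-CPU nodes with 4 NUMA nodes each: prints "8 12"
mpi_omp_layout 96 48 4
```

Passing different figures for another queue gives a quick starting point for the preliminary performance tests.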
The input file must be named dftb_in.hsd. It must provide the path to the Slater-Koster parameter sets. A large number of Slater-Koster parameter sets is available in the directory /apps/dftbplus/slako/. Also, if OMP threads are used in DFTB+ MPI-OMP calculations, the DFTB+ input file must include the Parallel section with UseOmpThreads set to .true.; the program default is UseOmpThreads = .false.
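As an illustration of the Parallel section requirement, the sketch below writes a minimal block to a scratch file for inspection. It assumes that logical values are spelled Yes/No in HSD input, which should be confirmed against manual.pdf. The file name parallel_block.hsd is arbitrary, and the block would still need to be merged into a complete dftb_in.hsd.

```shell
#!/bin/bash
# Writes an example Parallel block to a scratch file. The file name is
# arbitrary and the Yes/No spelling of logical values is an assumption
# about HSD syntax; verify it in the DFTB+ manual before use.
cat > parallel_block.hsd <<'EOF'
Parallel {
  # Allow OpenMP threads inside each MPI rank (disabled by default)
  UseOmpThreads = Yes
}
EOF

cat parallel_block.hsd
```

For pure MPI runs with OMP_NUM_THREADS=1 the setting can be left at its default.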
The user's manual "manual.pdf" for DFTB+ and the manual "elsi_manual_v2.9.1.pdf" for the ELSI eigensolver library are provided in the directory $DFTBPLUS_ROOT. A large set of documentation, including the manual, recipes and tutorials, is available online on the developers' website.