Overview

nccmp stands for NetCDF Compare. It compares two NetCDF files bitwise, semantically, or with a user-defined tolerance (absolute or relative percentage). Parallel comparisons are done in local memory without requiring temporary files. It is well suited to regression testing of scientific models or datasets in a test-driven development environment. Features include:


  • Multi-threaded if POSIX threads are supported on your system.
  • Prints differences and their locations (C or Fortran indexing). Format precision is customisable and allows hex.
  • Specific variable inclusion or exclusion.
  • Specific groups by short or full (absolute) names, or all (including recursive groups).
  • Metadata and/or data comparisons.
  • Global attributes compared with or without the history attribute.
  • Exits when first difference is found or optionally continues to process all variables and/or attributes by force, or up to a global or per-variable count.
  • Attribute exclusion when comparing metadata.
  • Bitwise compare or with tolerances (absolute or relative).
  • Header padding contents and attribute comparisons for NetCDF 3 classic/64-bit file formats.
  • NaN values can be treated as equal (in case you use NaNs as grid masks).
  • Compare or dump encodings for one or both files: checksumming, chunking, compression, endianness, format, shuffling, and header-pad sizes.
  • Supports all NetCDF 4/HDF atomic types (char/text, schar, uchar, short, ushort, int, uint, int64, uint64, float, double, string) and user-defined types (enumeration, compound, opaque blob, variable-length array), including nesting.
  • Compare asymmetric atomic data type values, within variable-length arrays too. For example, a variable can be uint in the first file, and int64 in the second.
  • Compare similarly named compound fields and ignore differently named or missing fields, so compound schemas may have variation in field type, field order and field existence.
  • Asymmetric enum value and datatype semantics. Enum identifiers are compared instead of their encoded values, so your dataset schema can evolve flexibly without worrying about enum order, size, type, or numeric values, although any metadata differences will be reported.
  • Print data difference statistics (count, sum, absolute sum, min, max, range, mean, stdev) for numeric variables and compound fields. In quiet mode, only this summary table will print.
  • Lightweight. Comparing multi-gigabyte files will only consume several megabytes of arena memory.
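
As a rough illustration of the tolerance feature listed above, the sketch below wraps nccmp in a small shell function. The function name compare_with_tolerance is hypothetical and not part of nccmp. The flags used (-d for data, -f to continue past the first difference, -t for an absolute tolerance, -T for a relative tolerance in percent) are the ones nccmp documents, but it is worth confirming them against nccmp --help for the installed version.

```shell
# compare_with_tolerance is a hypothetical wrapper, not an nccmp command.
# Usage: compare_with_tolerance abs|rel TOL file1 file2
compare_with_tolerance() {
    mode="$1"; tol="$2"; file1="$3"; file2="$4"
    case "$mode" in
        abs) flag="-t" ;;   # absolute tolerance
        rel) flag="-T" ;;   # relative tolerance, in percent
        *)   echo "mode must be abs or rel" >&2; return 2 ;;
    esac
    # -d compares variable data; -f continues past the first difference.
    nccmp -d -f "$flag" "$tol" "$file1" "$file2"
}
```

With the nccmp module loaded, a call such as compare_with_tolerance abs 1e-6 file1.nc file2.nc would report only data differences larger than 1e-6; the file names here are placeholders.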


More information: https://gitlab.com/remikz/nccmp

Usage

You can check the versions installed on Gadi with a module query:

$ module avail nccmp

We normally recommend using the latest version available, and always recommend specifying the version number with the module command:

$ module load nccmp/1.8.5.0

nccmp/1.8.5.0 is built using netcdf/4.7.1.

For more details on using modules see our modules help guide at https://opus.nci.org.au/display/Help/Environment+Modules.

Start an interactive PBS job on Gadi with the command below. It requests 1 CPU core, 2 GiB of memory, and 8 GiB of local disk (jobfs) on a compute node in the normal queue, for exclusive use for 30 minutes, charged to project a00. It also asks the system to enter the working directory once the job starts. To change the number of CPU cores, memory, or jobfs, modify the corresponding PBS resource requests in the qsub command below according to the information available at https://opus.nci.org.au/display/Help/Queue+Structure. If your application does not run in parallel, set the number of CPU cores to 1 and adjust the memory and jobfs accordingly to avoid wasting compute resources.

Also note that you must add -l storage=scratch/ab12+gdata/yz98 to the qsub command below if the job needs access to /scratch/ab12/ and /g/data/yz98/. For details, see https://opus.nci.org.au/display/Help/PBS+Directives+Explained.

$ qsub -I -P a00 -q normal -l ncpus=1,mem=2GB,jobfs=8GB,walltime=00:30:00,wd
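
For a job that needs the storage areas mentioned above, the same request with the storage directive added might be assembled as in the sketch below. The project code a00 and the storage areas ab12 and yz98 are the placeholders used in the text, and the variable names are only illustrative. The snippet prints the command rather than submitting it, so it can be reviewed first.

```shell
# Placeholder project code and storage areas taken from the text above;
# replace them with your own before running.
PROJECT=a00
STORAGE=scratch/ab12+gdata/yz98

# Same resource request as above, with the storage directive added.
RESOURCES="ncpus=1,mem=2GB,jobfs=8GB,walltime=00:30:00,storage=${STORAGE},wd"

# Print the command for review; run the printed line on a Gadi login node.
echo "qsub -I -P ${PROJECT} -q normal -l ${RESOURCES}"
```

Keeping the project and storage values in variables makes it harder to forget one of them when adapting the request to another project.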

When the interactive job starts on Gadi, run the following commands:

# Load module, always specify version number.
$ module load nccmp/1.8.5.0

# Run nccmp application
$ nccmp [Options] file1 [file2]
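
For regression testing, the general form above could be wrapped in a small helper such as the sketch below. The name compare_nc is hypothetical. The sketch assumes that nccmp exits with a non-zero status when the files differ, and uses the documented -d (data), -m (metadata) and -f (continue after the first difference) options; check nccmp --help for the installed version before relying on it.

```shell
# compare_nc is a hypothetical helper name, not part of nccmp.
# Usage: compare_nc reference.nc candidate.nc
compare_nc() {
    ref="$1"; new="$2"
    # -d compares data, -m compares metadata,
    # -f keeps going after the first difference.
    if nccmp -d -m -f "$ref" "$new"; then
        echo "IDENTICAL: $ref $new"
    else
        # Assumption: a non-zero exit status means the files differ
        # (it may also indicate an error, e.g. a missing file).
        echo "DIFFERENT: $ref $new"
        return 1
    fi
}
```

A test script could then call compare_nc baseline.nc output.nc (placeholder file names) and fail the run when the function returns a non-zero status.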

For more information about the nccmp command's options, see its help output:

# Load module, always specify version number.
$ module load nccmp/1.8.5.0

$ nccmp --help