You can use the NCI-data-analysis module to manipulate NetCDF files in parallel.
Gadi
Example job script:
#!/bin/bash
#PBS -l ncpus=4
#PBS -l mem=16GB
#PBS -l jobfs=20GB
#PBS -q normal
#PBS -P a00
#PBS -l walltime=02:00:00
#PBS -l storage=gdata/dk92+gdata/a00+scratch/a00
#PBS -l wd

module use /g/data/dk92/apps/Modules/modulefiles
module load NCI-data-analysis/2022.06

mpirun python3 par_nc4_test.py >& output.log
If you run the par_nc4_test.py script below with the job script above,
from mpi4py import MPI
import numpy as np
from netCDF4 import Dataset

comm = MPI.COMM_WORLD        # Use the world communicator
mpi_rank = comm.Get_rank()   # The process ID
mpi_size = comm.Get_size()   # Total amount of ranks

with Dataset('output.nc','w',parallel=True) as f:
    d = f.createDimension('dim',mpi_size)
    v = f.createVariable('var', np.int64, 'dim')
    v[mpi_rank] = mpi_rank

comm.Barrier()

if (mpi_rank == 0):
    print(mpi_size,' MPI ranks have finished writing!')
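you will get a NetCDF file named "output.nc" containing the output from all 4 MPI ranks. The job requests 4 CPUs, so mpirun starts 4 ranks, and each rank writes its own rank number to one element of "var".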
$ ncdump output.nc
netcdf output {
dimensions:
	dim = 4 ;
variables:
	int64 var(dim) ;
data:

 var = 0, 1, 2, 3 ;
}
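In the example above each rank writes exactly one element, so the dimension length equals the number of ranks. When the dimension is longer than the number of ranks, one common approach is to give each rank a contiguous block of indices. The sketch below shows one way to compute such a block; the helper name block_range and the sizes used are hypothetical and are not part of the NCI-data-analysis module or of netCDF4.

```python
def block_range(n_items, n_ranks, rank):
    """Return the half-open index range [start, stop) owned by `rank`.

    Hypothetical helper: splits n_items as evenly as possible over n_ranks.
    """
    base, extra = divmod(n_items, n_ranks)
    # The first `extra` ranks each take one additional item.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


if __name__ == "__main__":
    # Illustrative sizes only: 10 items shared among 4 ranks.
    for rank in range(4):
        print(rank, block_range(10, 4, rank))
```

In a script like par_nc4_test.py, each rank could then call block_range(n_items, mpi_size, mpi_rank) and assign to v[start:stop] instead of v[mpi_rank], assuming the dimension was created with length n_items.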