Introduction

ERA5 is a climate reanalysis dataset produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). It provides a comprehensive and consistent record of the Earth's climate system, including variables like temperature, wind, humidity, pressure, and many others. ERA5 combines observations from various sources, such as satellites, ground-based weather stations, and ocean buoys, with advanced numerical models to generate a detailed representation of the atmosphere, ocean, land, and sea ice.

ERA5 is often used for weather and climate analysis, weather forecasting, and climate research. It is considered one of the most advanced and accurate reanalysis datasets available.

NCI's ERA5 datasets require direct access to NCI: project rt52 is needed to analyse the data, and project dk92 is needed to use our intake-esm indexes. To simplify searching and loading ERA5 datasets for quick analysis and visualization, you can use the ERA5 catalog file provided by our NCI intake-esm package.


You need access to project dk92 to read the NCI intake-esm catalog files under /g/data/dk92/catalog/v2/esm.

You need access to project rt52 to load the ERA5 dataset itself.
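
A quick way to confirm that both projects are reachable from your session is to test whether the relevant directories can be read. The sketch below uses only the Python standard library. The catalog directory is the one named above; /g/data/rt52 is an assumed location for the ERA5 files and check_access is a hypothetical helper name, so adjust both as needed.

```python
import os

# Catalog directory named on this page (project dk92).
CATALOG_DIR = "/g/data/dk92/catalog/v2/esm"
# Assumed location of the ERA5 files (project rt52); adjust if it differs.
ERA5_DIR = "/g/data/rt52"


def check_access(paths):
    """Return {path: bool}, True when the path is a directory that can be listed."""
    return {p: os.path.isdir(p) and os.access(p, os.R_OK | os.X_OK) for p in paths}


# Report the status of each directory.
for path, ok in check_access([CATALOG_DIR, ERA5_DIR]).items():
    print(f"{path}: {'readable' if ok else 'NOT readable'}")
```

If either directory is reported as not readable, the corresponding project access is probably missing from your account or session.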

Operations

You can run the following operations in an NCI ARE JupyterLab session or in a Python script.

Opening Catalog Files

First, open the catalog file with the intake open_esm_datastore method.

import intake
data_catalog = intake.open_esm_datastore("/g/data/dk92/catalog/v2/esm/era5-rt52/catalog.json")

It contains several catalog fields, as listed below.

era5-rt52 catalog with 1581 dataset(s) from 710059 asset(s):

                   unique
path               710059
file_type               1
product                13
variable              306
stream                  8
levtype                 3
time_range            972
derived_variable        0

You can obtain a list of unique values for each column, as demonstrated below. Some ERA5 variables come in several variants. For instance, the '2t' variable is available as reanalysis, monthly averaged, and more. These variants could be distinguished by parsing the file paths, but to streamline the workflow we have introduced a 'product' column that holds the relevant components of the file path, so that each variant of every variable can be identified uniquely.

script
data_catalog.df["variable"].unique()
data_catalog.df["stream"].unique()
data_catalog.df["product"].unique()
data_catalog.df["levtype"].unique()
data_catalog.df["time_range"].unique()
output
array(['mtnlwrf', 'cvh', 'rhoao', 'vimad', 'alnid', 'nsss', 'stl2',
       'ilspf', 'mtdwswrf', 'cvl', 'flsr', 'shts', 'tcc', 'lsm',
       'msdwlwrf', 'wss', 'i10fg', 'mmtss', 'mpts', 'lai-hv', 'lsrr',
       'vioze', 'str', 'mbld', 'mper', 'lblt', 'msshf', 'tciw', 'ci',
       'vithee', 'p1ww', 'vithed', 'mwd', 'vilwn', 'sst', 'ttrc', 'msror',
       'tvh', 'mwd3', 'shww', 'phioc', 'swh1', 'vimat', 'sshf', 'phiaw',
       'slhf', 'crr', 'mslhf', 'msnswrfcs', 'mwd2', 'skt', 'mdww', 'dwps',
       'dwi', 'csfr', '100v', 'viwvn', 'istl3', 'stl4', 'dl', 'cdww',
       'csf', 'inss', 'fal', 'megwss', 'ishf', 'viked', 'alnip', 'swvl1',
       '10si', 'ewss', 'cape', 'lmlt', 'smlt', 'swvl4', 'aluvp', 'tcrw',
       'zust', 'lict', 'aluvd', 'msdwswrfcs', 'lssfr', 'mtnswrfcs',
       'strd', 'tisr', 'wmb', 'cbh', 'v10n', 'mgwd', 'vilwd', 'lspf',
       'viiwe', 'e', 'vikee', 'msl', 'viken', 'chnk', 'tco3', 'asn',
       'sdfor', 'mtnlwrfcs', 'msnlwrf', 'msdwlwrfcs', 'lcc', 'mvimd',
       'ssrd', 'kx', 'msnlwrfcs', 'bld', 'vima', 'cin', 'viman', 'mcc',
       'p2ps', 'istl1', 'swh', 'es', 'vitoee', 'dndza', 'sf', 'mtnswrf',
       'tsrc', 'viec', 'lai-lv', 'mser', 'vige', 'lsp', 'lshf', 'lsf',
       'fdir', 'msr', 'ust', 'cl', 'mpww', 'vit', 'tsn', 'vst', '10v',
       'magss', 'dwww', 'vigd', 'iews', 'mp1', 'mwp2', 'viozd', 'mtpr',
       'tp', 'swvl2', 'lmld', 'totalx', 'bfi', 'viiwn', '10u', 'stl1',
       'blh', 'mgws', 'isor', 'istl4', 'vithen', 'vipie', 'mwp3', 'z',
       'strc', 'mcpr', 'viozn', 'wstar', 'ro', 'swh3', 'mwp', 'mngwss',
       'sro', 'msdrswrfcs', 'p2ww', 'ie', 'vimd', 'msdwswrf', 'viwvd',
       'sdor', 'wsp', 'vilwe', 'ssrc', 'vitoed', 'mlspr', 'tvl', 'dctb',
       'ptype', 'wdw', 'tclw', 'mlspf', 'msnswrf', 'mssror', 'msmr',
       'anor', 'mer', 'wind', 'rsn', 'mwp1', '2d', 'slor', 'vithe', 'gwd',
       'mlssr', 'tcwv', 'mdts', 'ttr', 'vign', 'msqs', 'dndzn', 'tcsw',
       'swh2', 'tcslw', 'msdwuvrf', 'lgws', 'viiwd', 'u10n', 'src',
       'hmax', 'ssro', 'mntss', 'mwd1', 'stl3', 'licd', 'uvb', 'istl2',
       'mp2', 'strdc', 'msdrswrf', '100u', 'vimae', 'vitoen', 'sp', 'fsr',
       'tcw', 'mcsr', 'cdir', 'vike', 'vitoe', 'p1ps', 'tauoc', 'pev',
       'vipile', 'wsk', 'metss', 'tmax', 'sd', 'tplt', 'tplb', 'ssrdc',
       'deg0l', '2t', 'cp', 'slt', 'viwve', 'tsr', 'swvl3', 'ssr', 'mror',
       'hcc', 'mxtpr', 'acwh', 'mx2t', 'ltlt', 'pp1d', '10fg', 'awh',
       'arrc', 'mntpr', 'mn2t', 'w', 'pv', 'r', 'o3', 'cswc', 'vo',
       'ciwc', 'd', 't', 'q', 'clwc', 'u', 'dmc', 'erc', 'danger_risk',
       'kbdi', 'ic', 'isi', 'dsr', 'dc', 'ffmc', 'bi', 'bui', 'fwi',
       'fdi', 'sc', 'Snowf', 'Rainf', 'LWdown', 'Qair', 'SWdown', 'Wind',
       'Tair', 'PSurf', 'cc', 'crwc', 'v', 'ASurf'], dtype=object)
array(['mnth', 'wamd', 'wamo', 'moda',
       'wave', 'oper', 'cru-gpcc', 'cru'], dtype=object)
array(['era5-monthly-averaged-by-hour', 'era5-monthly-averaged',
       'era5-reanalysis', 'era5t-reanalysis',
       'era5-preliminary-monthly-averaged-by-hour',
       'era5-preliminary-reanalysis', 'era5-1-monthly-averaged',
       'era5-1-reanalysis', 'era5-1-monthly-averaged-by-day',
       'era5-derived-cems-v3-1', 'era5-derived-cems-v4-0',
       'era5-derived-wfde5-v1-1', 'era5-preliminary-monthly-averaged'],
      dtype=object)
array(['sfc', 'pl', 'na'], dtype=object)
array(['19781101-19781130', '20070501-20070531', '19770201-19770228',
       '19620301-19620331', '20050201-20050228', '19670401-19670430',
       '19811001-19811031', '20200901-20200930', '19730501-19730531',
       '20221101-20221130', '20160701-20160731', '19871101-19871130',
       '20180901-20180930', '19881201-19881231', '19840301-19840331',
       '19990801-19990831', '20030501-20030531', '19890501-19890531',
       '19850301-19850331', '19991201-19991231', '20080301-20080331',
       '19941201-19941231', '19770901-19770930', '20040501-20040531',
...
'20230520-20230520', '20230503-20230503', '20230525-20230525', '20230507-20230507', '20230407-20230407', '20230426-20230426'], dtype=object)
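
The time_range values follow a YYYYMMDD-YYYYMMDD pattern, so they can be converted to real dates when a selection is easier to express as a date interval than as a string. The following is a minimal sketch using only the standard library; parse_time_range and overlaps are hypothetical helper names, not part of intake-esm.

```python
from datetime import date, datetime


def parse_time_range(value):
    """Split 'YYYYMMDD-YYYYMMDD' into a (start, end) pair of date objects."""
    start, end = value.split("-")
    fmt = "%Y%m%d"
    return datetime.strptime(start, fmt).date(), datetime.strptime(end, fmt).date()


def overlaps(value, first, last):
    """True if the time_range string overlaps the closed interval [first, last]."""
    start, end = parse_time_range(value)
    return start <= last and end >= first


# A few values copied from the listing above.
sample = ["19781101-19781130", "20070501-20070531", "20230520-20230520"]
wanted = [v for v in sample if overlaps(v, date(2007, 1, 1), date(2023, 12, 31))]
print(wanted)
```

A list built this way could then be supplied as the time_range values of a catalog search, in place of hand-written strings.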

Searching the Catalog Data

You can now search the catalog using the unique values above, as in the example below. The special character ^ anchors a pattern to the start of a value, so the two time_range patterns select all entries whose time range starts with 1959 or 1960.

catalog_subset = data_catalog.search(
    variable=["t","u","v","r","z","10u","10v","100u","100v","2t","sp","msl","tcwv"],
    product="era5-reanalysis",
    time_range=["^1959/*","^1960/*"],
    levtype=["sfc"])

This produces the following catalog subset:

era5-rt52 catalog with 9 dataset(s) from 216 asset(s):

                   unique
path                  216
file_type               1
product                 1
variable                9
stream                  1
levtype                 1
time_range             24
derived_variable        0
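
To see what the time_range patterns in the search above select, the same expressions can be tried with Python's re module on a handful of values. This is only an illustration: it assumes that intake-esm applies such patterns with regular-expression search semantics, and the 1959 and 1960 values are assumed to follow the same naming as the listed entries. Under those assumptions the trailing /* (zero or more slash characters) matches nothing extra, so ^1959/* behaves like ^1959.

```python
import re

# Patterns copied from the search call above.
patterns = ["^1959/*", "^1960/*"]

# Two assumed 1959/1960 entries plus two values copied from the catalog listing.
values = [
    "19590101-19590131",
    "19600201-19600229",
    "19620301-19620331",
    "19781101-19781130",
]


def matches_any(value, patterns):
    """True if any pattern is found in value, using re.search semantics."""
    return any(re.search(p, value) for p in patterns)


selected = [v for v in values if matches_any(v, patterns)]
print(selected)
```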

Loading Datasets

You can load the datasets in this catalog subset directly, as shown below.

dsets = catalog_subset.to_dataset_dict()

The resulting 'dsets' dictionary holds one xarray Dataset for each variable matched by the search:

{'f.era5-reanalysis.100u.oper.sfc': <xarray.Dataset>
 Dimensions:    (longitude: 1440, latitude: 721, time: 17544)
 Coordinates:
   * longitude  (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8
   * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
   * time       (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00
     u100       (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray>
 Data variables:
     *empty*
 Attributes:
     Conventions:                     CF-1.6
     license:                         Licence to use Copernicus Products: http...
     summary:                         ERA5 is the fifth generation ECMWF atmos...
     intake_esm_vars:                 ['100u']
     intake_esm_attrs:file_type:      f
     intake_esm_attrs:product:        era5-reanalysis
     intake_esm_attrs:variable:       100u
     intake_esm_attrs:stream:         oper
     intake_esm_attrs:levtype:        sfc
     intake_esm_attrs:_data_format_:  netcdf
     intake_esm_dataset_key:          f.era5-reanalysis.100u.oper.sfc,
 'f.era5-reanalysis.10u.oper.sfc': <xarray.Dataset>
 Dimensions:    (longitude: 1440, latitude: 721, time: 17544)
 Coordinates:
   * longitude  (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8
   * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
   * time       (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00
     u10        (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray>
 Data variables:
     *empty*
 Attributes:
     Conventions:                     CF-1.6
     license:                         Licence to use Copernicus Products: http...
     summary:                         ERA5 is the fifth generation ECMWF atmos...
     intake_esm_vars:                 ['10u']
     intake_esm_attrs:file_type:      f
     intake_esm_attrs:product:        era5-reanalysis
     intake_esm_attrs:variable:       10u
     intake_esm_attrs:stream:         oper
     intake_esm_attrs:levtype:        sfc
     intake_esm_attrs:_data_format_:  netcdf
     intake_esm_dataset_key:          f.era5-reanalysis.10u.oper.sfc,
 'f.era5-reanalysis.10v.oper.sfc': <xarray.Dataset>
 Dimensions:    (longitude: 1440, latitude: 721, time: 17544)
 Coordinates:
   * longitude  (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8
   * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
   * time       (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00
     v10        (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray>
 Data variables:
     *empty*
 Attributes:
     Conventions:                     CF-1.6
     license:                         Licence to use Copernicus Products: http...
     summary:                         ERA5 is the fifth generation ECMWF atmos...
     intake_esm_vars:                 ['10v']
     intake_esm_attrs:file_type:      f
     intake_esm_attrs:product:        era5-reanalysis
     intake_esm_attrs:variable:       10v
     intake_esm_attrs:stream:         oper
     intake_esm_attrs:levtype:        sfc
     intake_esm_attrs:_data_format_:  netcdf
     intake_esm_dataset_key:          f.era5-reanalysis.10v.oper.sfc,
 'f.era5-reanalysis.sp.oper.sfc': <xarray.Dataset>
 Dimensions:    (longitude: 1440, latitude: 721, time: 17544)
 Coordinates:
   * longitude  (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8
   * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
   * time       (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00
 Data variables:
     sp         (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray>
 Attributes:
     Conventions:                     CF-1.6
     license:                         Licence to use Copernicus Products: http...
     summary:                         ERA5 is the fifth generation ECMWF atmos...
     intake_esm_vars:                 ['sp']
     intake_esm_attrs:file_type:      f
     intake_esm_attrs:product:        era5-reanalysis
     intake_esm_attrs:variable:       sp
     intake_esm_attrs:stream:         oper
     intake_esm_attrs:levtype:        sfc
     intake_esm_attrs:_data_format_:  netcdf
     intake_esm_dataset_key:          f.era5-reanalysis.sp.oper.sfc,
 'f.era5-reanalysis.msl.oper.sfc': <xarray.Dataset>
 Dimensions:    (longitude: 1440, latitude: 721, time: 17544)
 Coordinates:
   * longitude  (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8
   * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
   * time       (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00
 Data variables:
     msl        (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray>
 Attributes:
     Conventions:                     CF-1.6
     license:                         Licence to use Copernicus Products: http...
     summary:                         ERA5 is the fifth generation ECMWF atmos...
     intake_esm_vars:                 ['msl']
     intake_esm_attrs:file_type:      f
     intake_esm_attrs:product:        era5-reanalysis
     intake_esm_attrs:variable:       msl
     intake_esm_attrs:stream:         oper
     intake_esm_attrs:levtype:        sfc
     intake_esm_attrs:_data_format_:  netcdf
     intake_esm_dataset_key:          f.era5-reanalysis.msl.oper.sfc,
 'f.era5-reanalysis.2t.oper.sfc': <xarray.Dataset>
 Dimensions:    (longitude: 1440, latitude: 721, time: 17544)
 Coordinates:
   * longitude  (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8
   * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
   * time       (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00
     t2m        (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray>
 Data variables:
     *empty*
 Attributes:
     Conventions:                     CF-1.6
     license:                         Licence to use Copernicus Products: http...
     summary:                         ERA5 is the fifth generation ECMWF atmos...
     intake_esm_vars:                 ['2t']
     intake_esm_attrs:file_type:      f
     intake_esm_attrs:product:        era5-reanalysis
     intake_esm_attrs:variable:       2t
     intake_esm_attrs:stream:         oper
     intake_esm_attrs:levtype:        sfc
     intake_esm_attrs:_data_format_:  netcdf
     intake_esm_dataset_key:          f.era5-reanalysis.2t.oper.sfc,
 'f.era5-reanalysis.100v.oper.sfc': <xarray.Dataset>
 Dimensions:    (longitude: 1440, latitude: 721, time: 17544)
 Coordinates:
   * longitude  (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8
   * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
   * time       (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00
     v100       (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray>
 Data variables:
     *empty*
 Attributes:
     Conventions:                     CF-1.6
     license:                         Licence to use Copernicus Products: http...
     summary:                         ERA5 is the fifth generation ECMWF atmos...
     intake_esm_vars:                 ['100v']
     intake_esm_attrs:file_type:      f
     intake_esm_attrs:product:        era5-reanalysis
     intake_esm_attrs:variable:       100v
     intake_esm_attrs:stream:         oper
     intake_esm_attrs:levtype:        sfc
     intake_esm_attrs:_data_format_:  netcdf
     intake_esm_dataset_key:          f.era5-reanalysis.100v.oper.sfc,
 'f.era5-reanalysis.z.oper.sfc': <xarray.Dataset>
 Dimensions:    (longitude: 1440, latitude: 721, time: 17544)
 Coordinates:
   * longitude  (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8
   * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
   * time       (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00
 Data variables:
     z          (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray>
 Attributes:
     Conventions:                     CF-1.6
     license:                         Licence to use Copernicus Products: http...
     summary:                         ERA5 is the fifth generation ECMWF atmos...
     intake_esm_vars:                 ['z']
     intake_esm_attrs:file_type:      f
     intake_esm_attrs:product:        era5-reanalysis
     intake_esm_attrs:variable:       z
     intake_esm_attrs:stream:         oper
     intake_esm_attrs:levtype:        sfc
     intake_esm_attrs:_data_format_:  netcdf
     intake_esm_dataset_key:          f.era5-reanalysis.z.oper.sfc,
 'f.era5-reanalysis.tcwv.oper.sfc': <xarray.Dataset>
 Dimensions:    (longitude: 1440, latitude: 721, time: 17544)
 Coordinates:
   * longitude  (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8
   * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
   * time       (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00
 Data variables:
     tcwv       (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray>
 Attributes:
     Conventions:                     CF-1.6
     license:                         Licence to use Copernicus Products: http...
     summary:                         ERA5 is the fifth generation ECMWF atmos...
     intake_esm_vars:                 ['tcwv']
     intake_esm_attrs:file_type:      f
     intake_esm_attrs:product:        era5-reanalysis
     intake_esm_attrs:variable:       tcwv
     intake_esm_attrs:stream:         oper
     intake_esm_attrs:levtype:        sfc
     intake_esm_attrs:_data_format_:  netcdf
     intake_esm_dataset_key:          f.era5-reanalysis.tcwv.oper.sfc}
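
The dictionary keys shown above appear to join the file_type, product, variable, stream and levtype fields with dots, in the order of the intake_esm_attrs attributes. Assuming that layout holds for every key, a small helper can split a key back into named fields, which makes it easier to pick a dataset by variable name. The names KEY_FIELDS, parse_key and by_variable are hypothetical and not part of intake-esm.

```python
# Field order inferred from the keys and attributes printed above (an assumption).
KEY_FIELDS = ("file_type", "product", "variable", "stream", "levtype")


def parse_key(key):
    """Split a dataset key such as 'f.era5-reanalysis.2t.oper.sfc' into a dict."""
    parts = key.split(".")
    if len(parts) != len(KEY_FIELDS):
        raise ValueError(f"unexpected key layout: {key!r}")
    return dict(zip(KEY_FIELDS, parts))


def by_variable(keys):
    """Map each variable short name to its full dataset key."""
    return {parse_key(k)["variable"]: k for k in keys}


# Two keys copied from the output above.
keys = ["f.era5-reanalysis.2t.oper.sfc", "f.era5-reanalysis.msl.oper.sfc"]
lookup = by_variable(keys)
print(lookup["2t"])
```

With the loaded data, by_variable(dsets) would build the same lookup from the real keys, and dsets[lookup["2t"]] would then return the 2t dataset.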
