Introduction
ERA5 is a climate reanalysis dataset produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). It provides a comprehensive and consistent record of the Earth's climate system, including variables like temperature, wind, humidity, pressure, and many others. ERA5 combines observations from various sources, such as satellites, ground-based weather stations, and ocean buoys, with advanced numerical models to generate a detailed representation of the atmosphere, ocean, land, and sea ice.
ERA5 is often used for weather and climate analysis, weather forecasting, and climate research. It is considered one of the most advanced and accurate reanalysis datasets available.
NCI's ERA5 datasets are accessible, but analysing the data requires direct access to NCI through project rt52, and using our intake-esm indexes requires access through project dk92. To simplify searching and loading ERA5 datasets for quick analysis and visualization, you can use the ERA5 catalog file provided by our NCI intake-esm package.
You must have connected to project dk92 to access NCI intake-esm catalog files under /g/data/dk92/catalog/v2/esm.
You must have connected to project rt52 to load the ERA5 dataset itself.
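Before opening the catalog, it can help to confirm that both project areas are visible from your session. The following is a minimal sketch using only the Python standard library. The catalog directory is the one named above; the path /g/data/rt52 is an assumption about where project rt52 is mounted, and the helper name check_readable is hypothetical.

```python
import os


def check_readable(paths):
    """Return a dict mapping each path to True if it is a readable directory."""
    return {
        path: os.path.isdir(path) and os.access(path, os.R_OK | os.X_OK)
        for path in paths
    }


REQUIRED_PATHS = [
    "/g/data/dk92/catalog/v2/esm",  # intake-esm catalog files (named in the text above)
    "/g/data/rt52",  # ASSUMPTION: mount point of the rt52 project data
]

if __name__ == "__main__":
    for path, ok in check_readable(REQUIRED_PATHS).items():
        print(f"{path}: {'ok' if ok else 'not accessible'}")
```

If either path is reported as not accessible, the session was most likely started without that project's storage attached.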
Operations
You can perform the following operations in an NCI ARE JupyterLab session or in a Python script.
Opening Catalog Files
First, open a catalog file with the intake open_esm_datastore method.
import intake
data_catalog = intake.open_esm_datastore("/g/data/dk92/catalog/v2/esm/era5-rt52/catalog.json")
It contains several catalog fields, as listed below.
era5-rt52 catalog with 1581 dataset(s) from 710059 asset(s):
field | unique |
---|---|
path | 710059 |
file_type | 1 |
product | 13 |
variable | 306 |
stream | 8 |
levtype | 3 |
time_range | 972 |
derived_variable | 0 |
You can obtain a list of unique values for each column, as demonstrated below. Some ERA5 variables may have different variants. For instance, the '2t' variable has several variants, such as reanalysis, monthly averaged, and more. These variants can be distinguished by parsing the file paths. To streamline the workflow, we have introduced a 'product' column that encompasses the pertinent components of the file path, making it possible to uniquely identify each variant for every variable.
script | data_catalog.df["variable"].unique() | data_catalog.df["stream"].unique() | data_catalog.df["product"].unique() | data_catalog.df["levtype"].unique() | data_catalog.df["time_range"].unique() |
---|---|---|---|---|---|
output | array(['mtnlwrf', 'cvh', 'rhoao', 'vimad', 'alnid', 'nsss', 'stl2', 'ilspf', 'mtdwswrf', 'cvl', 'flsr', 'shts', 'tcc', 'lsm', 'msdwlwrf', 'wss', 'i10fg', 'mmtss', 'mpts', 'lai-hv', 'lsrr', 'vioze', 'str', 'mbld', 'mper', 'lblt', 'msshf', 'tciw', 'ci', 'vithee', 'p1ww', 'vithed', 'mwd', 'vilwn', 'sst', 'ttrc', 'msror', 'tvh', 'mwd3', 'shww', 'phioc', 'swh1', 'vimat', 'sshf', 'phiaw', 'slhf', 'crr', 'mslhf', 'msnswrfcs', 'mwd2', 'skt', 'mdww', 'dwps', 'dwi', 'csfr', '100v', 'viwvn', 'istl3', 'stl4', 'dl', 'cdww', 'csf', 'inss', 'fal', 'megwss', 'ishf', 'viked', 'alnip', 'swvl1', '10si', 'ewss', 'cape', 'lmlt', 'smlt', 'swvl4', 'aluvp', 'tcrw', 'zust', 'lict', 'aluvd', 'msdwswrfcs', 'lssfr', 'mtnswrfcs', 'strd', 'tisr', 'wmb', 'cbh', 'v10n', 'mgwd', 'vilwd', 'lspf', 'viiwe', 'e', 'vikee', 'msl', 'viken', 'chnk', 'tco3', 'asn', 'sdfor', 'mtnlwrfcs', 'msnlwrf', 'msdwlwrfcs', 'lcc', 'mvimd', 'ssrd', 'kx', 'msnlwrfcs', 'bld', 'vima', 'cin', 'viman', 'mcc', 'p2ps', 'istl1', 'swh', 'es', 'vitoee', 'dndza', 'sf', 'mtnswrf', 'tsrc', 'viec', 'lai-lv', 'mser', 'vige', 'lsp', 'lshf', 'lsf', 'fdir', 'msr', 'ust', 'cl', 'mpww', 'vit', 'tsn', 'vst', '10v', 'magss', 'dwww', 'vigd', 'iews', 'mp1', 'mwp2', 'viozd', 'mtpr', 'tp', 'swvl2', 'lmld', 'totalx', 'bfi', 'viiwn', '10u', 'stl1', 'blh', 'mgws', 'isor', 'istl4', 'vithen', 'vipie', 'mwp3', 'z', 'strc', 'mcpr', 'viozn', 'wstar', 'ro', 'swh3', 'mwp', 'mngwss', 'sro', 'msdrswrfcs', 'p2ww', 'ie', 'vimd', 'msdwswrf', 'viwvd', 'sdor', 'wsp', 'vilwe', 'ssrc', 'vitoed', 'mlspr', 'tvl', 'dctb', 'ptype', 'wdw', 'tclw', 'mlspf', 'msnswrf', 'mssror', 'msmr', 'anor', 'mer', 'wind', 'rsn', 'mwp1', '2d', 'slor', 'vithe', 'gwd', 'mlssr', 'tcwv', 'mdts', 'ttr', 'vign', 'msqs', 'dndzn', 'tcsw', 'swh2', 'tcslw', 'msdwuvrf', 'lgws', 'viiwd', 'u10n', 'src', 'hmax', 'ssro', 'mntss', 'mwd1', 'stl3', 'licd', 'uvb', 'istl2', 'mp2', 'strdc', 'msdrswrf', '100u', 'vimae', 'vitoen', 'sp', 'fsr', 'tcw', 'mcsr', 'cdir', 'vike', 'vitoe', 'p1ps', 
'tauoc', 'pev', 'vipile', 'wsk', 'metss', 'tmax', 'sd', 'tplt', 'tplb', 'ssrdc', 'deg0l', '2t', 'cp', 'slt', 'viwve', 'tsr', 'swvl3', 'ssr', 'mror', 'hcc', 'mxtpr', 'acwh', 'mx2t', 'ltlt', 'pp1d', '10fg', 'awh', 'arrc', 'mntpr', 'mn2t', 'w', 'pv', 'r', 'o3', 'cswc', 'vo', 'ciwc', 'd', 't', 'q', 'clwc', 'u', 'dmc', 'erc', 'danger_risk', 'kbdi', 'ic', 'isi', 'dsr', 'dc', 'ffmc', 'bi', 'bui', 'fwi', 'fdi', 'sc', 'Snowf', 'Rainf', 'LWdown', 'Qair', 'SWdown', 'Wind', 'Tair', 'PSurf', 'cc', 'crwc', 'v', 'ASurf'], dtype=object) | array(['mnth', 'wamd', 'wamo', 'moda', | array(['era5-monthly-averaged-by-hour', 'era5-monthly-averaged', 'era5-reanalysis', 'era5t-reanalysis', 'era5-preliminary-monthly-averaged-by-hour', 'era5-preliminary-reanalysis', 'era5-1-monthly-averaged', 'era5-1-reanalysis', 'era5-1-monthly-averaged-by-day', 'era5-derived-cems-v3-1', 'era5-derived-cems-v4-0', 'era5-derived-wfde5-v1-1', 'era5-preliminary-monthly-averaged'], dtype=object) | array(['sfc', 'pl', 'na'], dtype=object) | array(['19781101-19781130', '20070501-20070531', '19770201-19770228', '19620301-19620331', '20050201-20050228', '19670401-19670430', '19811001-19811031', '20200901-20200930', '19730501-19730531', '20221101-20221130', '20160701-20160731', '19871101-19871130', '20180901-20180930', '19881201-19881231', '19840301-19840331', '19990801-19990831', '20030501-20030531', '19890501-19890531', '19850301-19850331', '19991201-19991231', '20080301-20080331', '19941201-19941231', '19770901-19770930', '20040501-20040531', |
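The time_range values above are returned in no particular order. As a small illustration, the sketch below parses a few of the values shown in the output into start and end dates and sorts them. It assumes every time_range value follows the YYYYMMDD-YYYYMMDD pattern seen in the sample; the helper name parse_time_range is hypothetical.

```python
from datetime import datetime


def parse_time_range(value):
    """Split a 'YYYYMMDD-YYYYMMDD' string into (start_date, end_date)."""
    start, end = value.split("-")
    fmt = "%Y%m%d"
    return datetime.strptime(start, fmt).date(), datetime.strptime(end, fmt).date()


# A few values copied from the time_range output above.
sample = ["19781101-19781130", "20070501-20070531", "19620301-19620331"]

# Sort chronologically by (start, end) and report the overall span.
ordered = sorted(sample, key=parse_time_range)
first_start = parse_time_range(ordered[0])[0]
last_end = parse_time_range(ordered[-1])[1]

print(ordered)
print(first_start, last_end)
```

With the real catalog, the same helper could be applied to data_catalog.df["time_range"].unique() in place of the sample list.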
Searching the Catalog Data
You can now search the catalog using the unique values above, as in the example below. Note that the special character ^ anchors a pattern to the start of a value, so the time_range patterns here match all entries whose time range starts with 1959 or 1960.
catalog_subset = data_catalog.search(
    variable=["t", "u", "v", "r", "z", "10u", "10v", "100u", "100v", "2t", "sp", "msl", "tcwv"],
    product="era5-reanalysis",
    time_range=["^1959/*", "^1960/*"],
    levtype=["sfc"],
)
This produces the following catalog subset:
era5-rt52 catalog with 9 dataset(s) from 216 asset(s):
field | unique |
---|---|
path | 216 |
file_type | 1 |
product | 1 |
variable | 9 |
stream | 1 |
levtype | 1 |
time_range | 24 |
derived_variable | 0 |
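To see how the time_range patterns in the search above select entries, the sketch below applies the same two patterns with Python's re module. It assumes the catalog search treats these strings as regular expressions matched anywhere in the value, which is why the ^ anchor matters. The sample values are illustrative strings in the catalog's YYYYMMDD-YYYYMMDD format, and the third one is a hypothetical value chosen only to show the effect of the anchor.

```python
import re

patterns = ["^1959/*", "^1960/*"]

# Illustrative values; "19581201-19590101" is hypothetical and contains
# "1959" only after the start of the string.
time_ranges = [
    "19590101-19590131",
    "19601201-19601231",
    "19581201-19590101",
    "19781101-19781130",
]


def matches_any(value, patterns):
    """Return True if any pattern is found in value (regex search semantics)."""
    return any(re.search(pattern, value) for pattern in patterns)


selected = [value for value in time_ranges if matches_any(value, patterns)]
print(selected)
```

In regular-expression syntax the trailing /* means "zero or more / characters", so under this assumption "^1959/*" selects the same entries as "^1959" would.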
Loading Datasets
You can load the datasets in this catalog subset directly, as below:
dsets = catalog_subset.to_dataset_dict()
The 'dsets' dictionary contains one dataset for each of the filtered variables, as shown below.
{'f.era5-reanalysis.100u.oper.sfc': <xarray.Dataset> Dimensions: (longitude: 1440, latitude: 721, time: 17544) Coordinates: * longitude (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8 * latitude (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0 * time (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00 u100 (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray> Data variables: *empty* Attributes: Conventions: CF-1.6 license: Licence to use Copernicus Products: http... summary: ERA5 is the fifth generation ECMWF atmos... intake_esm_vars: ['100u'] intake_esm_attrs:file_type: f intake_esm_attrs:product: era5-reanalysis intake_esm_attrs:variable: 100u intake_esm_attrs:stream: oper intake_esm_attrs:levtype: sfc intake_esm_attrs:_data_format_: netcdf intake_esm_dataset_key: f.era5-reanalysis.100u.oper.sfc, 'f.era5-reanalysis.10u.oper.sfc': <xarray.Dataset> Dimensions: (longitude: 1440, latitude: 721, time: 17544) Coordinates: * longitude (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8 * latitude (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0 * time (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00 u10 (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray> Data variables: *empty* Attributes: Conventions: CF-1.6 license: Licence to use Copernicus Products: http... summary: ERA5 is the fifth generation ECMWF atmos... intake_esm_vars: ['10u'] intake_esm_attrs:file_type: f intake_esm_attrs:product: era5-reanalysis intake_esm_attrs:variable: 10u intake_esm_attrs:stream: oper intake_esm_attrs:levtype: sfc intake_esm_attrs:_data_format_: netcdf intake_esm_dataset_key: f.era5-reanalysis.10u.oper.sfc, 'f.era5-reanalysis.10v.oper.sfc': <xarray.Dataset> Dimensions: (longitude: 1440, latitude: 721, time: 17544) Coordinates: * longitude (longitude) float32 -180.0 -179.8 -179.5 ... 
179.2 179.5 179.8 * latitude (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0 * time (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00 v10 (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray> Data variables: *empty* Attributes: Conventions: CF-1.6 license: Licence to use Copernicus Products: http... summary: ERA5 is the fifth generation ECMWF atmos... intake_esm_vars: ['10v'] intake_esm_attrs:file_type: f intake_esm_attrs:product: era5-reanalysis intake_esm_attrs:variable: 10v intake_esm_attrs:stream: oper intake_esm_attrs:levtype: sfc intake_esm_attrs:_data_format_: netcdf intake_esm_dataset_key: f.era5-reanalysis.10v.oper.sfc, 'f.era5-reanalysis.sp.oper.sfc': <xarray.Dataset> Dimensions: (longitude: 1440, latitude: 721, time: 17544) Coordinates: * longitude (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8 * latitude (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0 * time (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00 Data variables: sp (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray> Attributes: Conventions: CF-1.6 license: Licence to use Copernicus Products: http... summary: ERA5 is the fifth generation ECMWF atmos... intake_esm_vars: ['sp'] intake_esm_attrs:file_type: f intake_esm_attrs:product: era5-reanalysis intake_esm_attrs:variable: sp intake_esm_attrs:stream: oper intake_esm_attrs:levtype: sfc intake_esm_attrs:_data_format_: netcdf intake_esm_dataset_key: f.era5-reanalysis.sp.oper.sfc, 'f.era5-reanalysis.msl.oper.sfc': <xarray.Dataset> Dimensions: (longitude: 1440, latitude: 721, time: 17544) Coordinates: * longitude (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8 * latitude (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0 * time (time) datetime64[ns] 1959-01-01 ... 
1960-12-31T23:00:00 Data variables: msl (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray> Attributes: Conventions: CF-1.6 license: Licence to use Copernicus Products: http... summary: ERA5 is the fifth generation ECMWF atmos... intake_esm_vars: ['msl'] intake_esm_attrs:file_type: f intake_esm_attrs:product: era5-reanalysis intake_esm_attrs:variable: msl intake_esm_attrs:stream: oper intake_esm_attrs:levtype: sfc intake_esm_attrs:_data_format_: netcdf intake_esm_dataset_key: f.era5-reanalysis.msl.oper.sfc, 'f.era5-reanalysis.2t.oper.sfc': <xarray.Dataset> Dimensions: (longitude: 1440, latitude: 721, time: 17544) Coordinates: * longitude (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8 * latitude (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0 * time (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00 t2m (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray> Data variables: *empty* Attributes: Conventions: CF-1.6 license: Licence to use Copernicus Products: http... summary: ERA5 is the fifth generation ECMWF atmos... intake_esm_vars: ['2t'] intake_esm_attrs:file_type: f intake_esm_attrs:product: era5-reanalysis intake_esm_attrs:variable: 2t intake_esm_attrs:stream: oper intake_esm_attrs:levtype: sfc intake_esm_attrs:_data_format_: netcdf intake_esm_dataset_key: f.era5-reanalysis.2t.oper.sfc, 'f.era5-reanalysis.100v.oper.sfc': <xarray.Dataset> Dimensions: (longitude: 1440, latitude: 721, time: 17544) Coordinates: * longitude (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8 * latitude (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0 * time (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00 v100 (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray> Data variables: *empty* Attributes: Conventions: CF-1.6 license: Licence to use Copernicus Products: http... 
summary: ERA5 is the fifth generation ECMWF atmos... intake_esm_vars: ['100v'] intake_esm_attrs:file_type: f intake_esm_attrs:product: era5-reanalysis intake_esm_attrs:variable: 100v intake_esm_attrs:stream: oper intake_esm_attrs:levtype: sfc intake_esm_attrs:_data_format_: netcdf intake_esm_dataset_key: f.era5-reanalysis.100v.oper.sfc, 'f.era5-reanalysis.z.oper.sfc': <xarray.Dataset> Dimensions: (longitude: 1440, latitude: 721, time: 17544) Coordinates: * longitude (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8 * latitude (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0 * time (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00 Data variables: z (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray> Attributes: Conventions: CF-1.6 license: Licence to use Copernicus Products: http... summary: ERA5 is the fifth generation ECMWF atmos... intake_esm_vars: ['z'] intake_esm_attrs:file_type: f intake_esm_attrs:product: era5-reanalysis intake_esm_attrs:variable: z intake_esm_attrs:stream: oper intake_esm_attrs:levtype: sfc intake_esm_attrs:_data_format_: netcdf intake_esm_dataset_key: f.era5-reanalysis.z.oper.sfc, 'f.era5-reanalysis.tcwv.oper.sfc': <xarray.Dataset> Dimensions: (longitude: 1440, latitude: 721, time: 17544) Coordinates: * longitude (longitude) float32 -180.0 -179.8 -179.5 ... 179.2 179.5 179.8 * latitude (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0 * time (time) datetime64[ns] 1959-01-01 ... 1960-12-31T23:00:00 Data variables: tcwv (time, latitude, longitude) float32 dask.array<chunksize=(744, 721, 1440), meta=np.ndarray> Attributes: Conventions: CF-1.6 license: Licence to use Copernicus Products: http... summary: ERA5 is the fifth generation ECMWF atmos... 
intake_esm_vars: ['tcwv'] intake_esm_attrs:file_type: f intake_esm_attrs:product: era5-reanalysis intake_esm_attrs:variable: tcwv intake_esm_attrs:stream: oper intake_esm_attrs:levtype: sfc intake_esm_attrs:_data_format_: netcdf intake_esm_dataset_key: f.era5-reanalysis.tcwv.oper.sfc}
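The dictionary keys in the output above appear to join the file_type, product, variable, stream and levtype attributes with dots. Under that assumption, a key can be split back into its parts to look up a dataset by variable name. This is a minimal sketch; DatasetKey and parse_dataset_key are hypothetical names, and the approach would not work for attribute values that themselves contain a dot.

```python
from typing import NamedTuple


class DatasetKey(NamedTuple):
    file_type: str
    product: str
    variable: str
    stream: str
    levtype: str


def parse_dataset_key(key):
    """Split 'file_type.product.variable.stream.levtype' into named fields."""
    parts = key.split(".")
    if len(parts) != 5:
        raise ValueError(f"expected 5 dot-separated fields, got {len(parts)}: {key!r}")
    return DatasetKey(*parts)


# Keys copied from the output above.
keys = [
    "f.era5-reanalysis.100u.oper.sfc",
    "f.era5-reanalysis.2t.oper.sfc",
    "f.era5-reanalysis.tcwv.oper.sfc",
]

# Map each short variable name to its full dictionary key.
by_variable = {parse_dataset_key(key).variable: key for key in keys}
print(by_variable["2t"])
```

With the loaded data, dsets[by_variable["2t"]] would then return the dataset for the '2t' variable, with keys taken from dsets.keys() rather than the hard-coded list.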