
Details of our intake-esm software are documented separately. Once intake-esm is loaded in your environment, you can start using our intake-esm data catalogues. NCI currently provides intake-esm data catalogues for CMIP5/CMIP6 data collections hosted at NCI.

Our intake-esm catalogue files are all located on the filesystem under /g/data/dk92/catalog/v2/esm. Note that you must be a member of project dk92 to access them.
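
As a quick check before opening anything, the catalogues visible to you can be listed with the Python standard library. This is only a minimal sketch: the helper name list_catalogs is hypothetical, and it assumes that each collection sits in its own subdirectory containing a catalog.json file, as the cmip6-oi10 path used on this page suggests.

```python
import os
from pathlib import Path

# Catalogue root quoted on this page.
CATALOG_ROOT = "/g/data/dk92/catalog/v2/esm"


def list_catalogs(root=CATALOG_ROOT):
    """Return the collection directories under root that contain a catalog.json.

    Hypothetical helper. Returns an empty list when root is missing or
    cannot be read, for example when you are not a member of project dk92.
    """
    if not os.access(root, os.R_OK | os.X_OK):
        return []
    # Assumed layout: <root>/<collection>/catalog.json
    return sorted(p.parent.name for p in Path(root).glob("*/catalog.json"))


print(list_catalogs())
```

An empty list usually means that the directory is not readable from where the code runs, so it is worth checking project membership before debugging anything else.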


First, open a catalogue file with the intake open_esm_datastore() method.

Open Catalogue File
import intake
cmip6 = intake.open_esm_datastore("/g/data/dk92/catalog/v2/esm/cmip6-oi10/catalog.json")

Evaluating the loaded esm_datastore (for example, as the last line of a notebook cell) gives an overview of its contents.

Get catalogue overview
cmip6

The datastore exposes a df attribute, which is a pandas DataFrame.

Get catalogue head
cmip6.df.head()

`cmip6.df.columns` lists all the columns (keys) that can be used to search the data.

Get all columns
cmip6.df.columns

The unique() method returns the unique values of each column as a dictionary. Any of these values can be used as a search term for its column.

List unique keys per column
values_dict = cmip6.unique()
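
Because some columns hold many unique values, it can help to filter them by a fragment of the name you are looking for. The sketch below is one way to do that under stated assumptions: values_matching is a hypothetical helper, the column name source_id is assumed rather than taken from the catalogue, and the small dictionary is only a stand-in for the result of cmip6.unique().

```python
def values_matching(values_dict, column, fragment):
    """Return the sorted values of `column` that contain `fragment`.

    Hypothetical helper; the match is case-insensitive.
    """
    fragment = fragment.lower()
    return sorted(v for v in values_dict[column] if fragment in str(v).lower())


# Stand-in for cmip6.unique(); "source_id" is an assumed column name.
example = {"source_id": ["NorESM2-LM", "MPI-ESM-1-2-HAM"]}
print(values_matching(example, "source_id", "mpi"))
```

In a real session you would pass values_dict from the step above instead of the stand-in dictionary.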

Select a subset by passing a combination of column values to the search() method. The returned result shows that the subset contains 18 files spanning multiple columns.

Search keywords
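# source_id is an assumed column name; the remaining search criteria are not shown here
subset = cmip6.search(source_id=['MPI-ESM-1-2-HAM', 'NorESM2-LM'])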
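
A mistyped column name is an easy mistake to make when building a search. As a hedged sketch, the wrapper below checks every keyword against the catalogue's columns before calling search(). The name search_checked is hypothetical and not part of intake-esm; it relies only on the df.columns attribute and the search() method described on this page.

```python
def search_checked(catalog, **criteria):
    """Call catalog.search(**criteria) after validating the column names.

    Hypothetical wrapper: raises KeyError if a keyword is not a column.
    """
    columns = set(catalog.df.columns)
    unknown = sorted(set(criteria) - columns)
    if unknown:
        raise KeyError(
            f"Not catalogue columns: {unknown}; available: {sorted(columns)}"
        )
    return catalog.search(**criteria)


# Usage in the session above (column name assumed):
# subset = search_checked(cmip6, source_id=["MPI-ESM-1-2-HAM", "NorESM2-LM"])
```

Failing early with the list of available columns tends to be quicker to diagnose than an unexpectedly empty result.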

Now you can open the datasets directly via the to_dataset_dict() API. Starting a Dask cluster first is recommended to speed this up.

For example, you can quickly set up a local Dask cluster using the resources of a single node, as below.

Start Dask cluster
from dask.distributed import Client, LocalCluster
cluster = LocalCluster()
client = Client(cluster)
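
If you prefer to size the cluster explicitly rather than rely on the defaults, the following sketch shows one possible approach. It assumes that the PBS_NCPUS environment variable is set inside a batch job and falls back to the local CPU count otherwise; worker_count and start_client are hypothetical helper names.

```python
import os


def worker_count(env=os.environ):
    """Return PBS_NCPUS if it is a positive integer, else the local CPU count."""
    value = env.get("PBS_NCPUS")  # assumed to be set by the batch system
    if value and value.isdigit() and int(value) > 0:
        return int(value)
    return os.cpu_count() or 1


def start_client():
    """Start a LocalCluster with one single-threaded worker per CPU."""
    # Imported here so that worker_count() works without Dask installed.
    from dask.distributed import Client, LocalCluster

    cluster = LocalCluster(n_workers=worker_count(), threads_per_worker=1)
    return Client(cluster)


print(worker_count())
```

Calling client = start_client() then replaces the three lines above, and client.close() disconnects the client when you have finished.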

Now you can invoke the to_dataset_dict() API, which returns a dictionary of all the datasets in our subset.

Print dataset metadata
dset_dict = subset.to_dataset_dict()
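
To see what the dictionary holds without printing every dataset in full, a short summary per key can be useful. The sketch below assumes that the values are xarray Dataset objects, so that data_vars and sizes are available; describe_datasets is a hypothetical helper name.

```python
def describe_datasets(dset_dict):
    """Return one summary line per dataset key, in sorted key order."""
    lines = []
    for key in sorted(dset_dict):
        ds = dset_dict[key]
        # data_vars and sizes are standard xarray Dataset attributes.
        variables = ", ".join(sorted(ds.data_vars))
        dims = ", ".join(f"{name}={size}" for name, size in ds.sizes.items())
        lines.append(f"{key}: variables [{variables}], dimensions [{dims}]")
    return lines


# Usage in the session above:
# for line in describe_datasets(dset_dict):
#     print(line)
```

The keys printed this way are the ones you can use to pick out an individual dataset.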

Finally, you can load a dataset using its key.

Access the dataset
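key = next(iter(dset_dict))  # the key is not shown in the original; the first key is used for illustration
ds = dset_dict[key]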

