The Climate Data Enhanced Virtual Laboratory (DEVL) project has focused on key components of the infrastructure needed to manage this massive data archive and make it accessible for CMIP6-based research in Australia. It builds on previous Australian e-infrastructure programs, the Climate & Weather Science Lab, and the National Earth Systems Data Collection and Data Services programs. It also supports NCI’s leading role in international collaborations, most notably the Earth System Grid Federation (ESGF), which provides the international federated capability for CMIP data. Sustaining this work over many years has required funding from various parties, including the NCRIS programs ANDS, RDS and NeCTAR (now the Australian Research Data Commons). This infrastructure has directly supported other major government-funded research investments through CAWCR (Collaboration for Australian Weather and Climate Research), NESP (National Environmental Science Program), the ARC CoE for Climate System Science (ARCCSS) and the ARC CoE for Climate Extremes (CLEX).
Climate data at NCI is provided in accordance with the FAIR principles: Findable, Accessible, Interoperable and Reusable. Providing a FAIR data service for such a large and complex data collection exposes significant data management challenges. NCI’s Data Quality Strategy (DQS) delivers data curation practices that support FAIR standards and interdisciplinary data availability. This service permits streamlined access to and analysis of CMIP6 data, enabling efficient, state-of-the-art climate science research to be undertaken.
The size and complexity of CMIP pose unique challenges that have required new services to be developed and then offered as well-managed operational services. The Climate DEVL has defined and developed mechanisms for improving the accessibility and usability of the data. One example is the need to find what data is available at NCI for analysis. This need has been addressed through NCI’s Metadata Attribute Search (MAS). MAS provides consistent access to the information contained in the climate data collections by harvesting the metadata within the millions of self-describing files that constitute the CMIP data collection. MAS also underpins CleF, a Python-based API developed by ARCCSS/CLEX that provides command-line search tools for accessing this data. CleF also gives researchers an easy interface to the ESGF search API for discovering which published CMIP data matches their specified requirements (experiment, variable, etc.) but is not yet available at NCI. The tool will be extended so that users can then submit a download request to have that data added to the NCI CMIP6 replica service.
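The "published on ESGF but not yet at NCI" check can be illustrated with a minimal Python sketch. This is not CleF's implementation or interface: the function names, the index node URL and the dataset identifiers below are assumptions chosen for illustration, and the node address in particular may not be current. The query parameters follow the facet names commonly used with the ESGF search API for CMIP6 (for example `experiment_id` and `variable_id`).

```python
from urllib.parse import urlencode

# Assumed ESGF index node (hypothetical choice); any node exposing the
# search API could be substituted.
ESGF_SEARCH_URL = "https://esgf-node.llnl.gov/esg-search/search"


def build_esgf_query(base_url=ESGF_SEARCH_URL, **facets):
    """Build a search URL for CMIP6 datasets matching the given facets."""
    params = {
        "project": "CMIP6",
        "type": "Dataset",                  # search datasets rather than files
        "latest": "true",                   # only the latest published version
        "format": "application/solr+json",  # request a JSON response
    }
    params.update(facets)
    # Sort the parameters so the generated URL is deterministic.
    return f"{base_url}?{urlencode(sorted(params.items()))}"


def missing_locally(published_ids, local_ids):
    """Return, sorted, the dataset IDs that are published but not held locally."""
    return sorted(set(published_ids) - set(local_ids))


if __name__ == "__main__":
    # Constructing the URL needs no network access; fetching it would.
    print(build_esgf_query(experiment_id="historical", variable_id="tas"))

    # Illustrative placeholder identifiers, not real holdings.
    published = ["CMIP6.example.model-a.tas.v1", "CMIP6.example.model-b.tas.v1"]
    local = ["CMIP6.example.model-a.tas.v1"]
    print(missing_locally(published, local))
```

In this sketch, the list returned by `missing_locally` corresponds to the data a researcher might ask to have added to the local replica; in practice the local side of the comparison would come from a catalogue such as MAS rather than a hard-coded list.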
Another aspect of the Climate DEVL has been a community-driven approach to defining the highest-priority CMIP6 data to be replicated in Australia for local analysis. This permits the timely development and publication of scientific research papers that analyse the CMIP6 data as it becomes available. The DEVL also supports the evaluation of various model analysis tools, which gives the community an opportunity to develop standardised data analysis workflows that contribute to those research papers.
The Climate DEVL also provides a home for coordinating the ongoing development and availability of the training materials needed for a streamlined user experience. The breadth of knowledge and the interdisciplinary topics that CMIP spans mean that effective training is needed, including face-to-face tutorials, online self-paced learning materials, and trainer training. The combined effort of NCI, CLEX, CSIRO and BoM allows such collaborative training to benefit the entire Australian climate science community.