Introduction
This page shows how to run state-of-the-art weather forecasting models on Gadi. All model files and data are already on the Gadi file system.
Note
Before you start on this page, you should already have:
a) an ARE session running with the ai-models/2023.11.14 module, and
b) the NCI-AI-Models notebooks cloned to a location on the Gadi file system.
If you have not completed both steps, go back to https://opus.nci.org.au/x/CIJnDw and complete all the steps there.
Machine learning models
The machine learning models are GPU-based and hence very fast. They are capable of producing a 7-day forecast within one or two minutes. Currently available models are:
FourCastNetv2-small: An enhancement of the Nvidia FourCastNet model that uses the Spherical Fourier Neural Operator (SFNO) and produces better results than the original FourCastNet model. (https://arxiv.org/pdf/2306.03838.pdf)
Graphcast: A deep learning-based prediction model developed by Google DeepMind. It is trained on ERA5 data at 0.25-degree resolution. (https://arxiv.org/pdf/2212.12794.pdf)
Pangu-Weather: A model based on a 3D transformer architecture, which uses spatial dependencies to produce better predictions. (https://www.nature.com/articles/s41586-023-06185-3)
In the cloned repository, you will find four notebooks corresponding to four models in addition to the 'LICENSE' and 'README' documents. The four notebooks are:
AI-Models-fourcastnet.ipynb
AI-Models-fourcastnet_v2.ipynb
AI-Models-graphcast.ipynb
AI-Models-panguweather.ipynb
Note
ECMWF has deprecated the original FourCastNet model and no longer provides updates for it.
Notebook contents
Although the four notebooks use four different models, they share a similar structure and follow the same sequence of steps. The sections below describe the functions generically; the description applies equally to all notebooks.
Note
Although the notebooks look similar, they do not use the same functions. Each notebook is specialized for a particular model, so swapping models or code between notebooks may not work.
NCI modules
At the top of each notebook, you will find the modules to import, as shown in Figure 1. They are developed by the NCI software team and contain customized code to run the models from local ERA5 data. If you have configured the Jupyter session correctly, the import statements will run without error; otherwise, go back to https://opus.nci.org.au/x/CIJnDw and follow the instructions.
Figure 1. Import modules
Select start date, time, and lead time
Next, Figure 2 shows how to select the prediction start date and time for a model. The cell contains a function called `getw_date_time_lead()`; when it is run, three tabs appear. From the first tab, you can select the year, month, and day to start the prediction. You can select any date for which data is present on the local disk. NCI data is updated periodically, which means the latest data may not be present when you run the code.
From the second tab, you can select one of the four start times for a day. Lastly, the third tab allows you to select a lead time, which determines how far ahead in time the prediction should be made. Once you have selected all three values, you are ready to move on to the next cell.
Figure 2. Select a date, time, and lead time.
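The three selections together define a forecast window. The sketch below is not the notebook code; it uses only the Python standard library to show how a start date, a start time, and a lead time determine the time the forecast is valid for. The four start hours (00, 06, 12, and 18 UTC), the lead time expressed in hours, the 6-hour model step, and the name `forecast_window` are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed values: the four daily start times and the model step are
# illustrative and are not taken from the NCI notebooks.
START_HOURS = (0, 6, 12, 18)
STEP_HOURS = 6


def forecast_window(year, month, day, start_hour, lead_hours):
    """Return (start, valid_time, n_steps) for a forecast request."""
    if start_hour not in START_HOURS:
        raise ValueError(f"start_hour must be one of {START_HOURS}")
    if lead_hours <= 0 or lead_hours % STEP_HOURS != 0:
        raise ValueError(f"lead_hours must be a positive multiple of {STEP_HOURS}")
    start = datetime(year, month, day, start_hour, tzinfo=timezone.utc)
    valid_time = start + timedelta(hours=lead_hours)
    # Number of model steps needed to reach the requested lead time.
    return start, valid_time, lead_hours // STEP_HOURS


# A 7-day (168-hour) forecast starting at 12 UTC on 1 January 2023.
start, valid_time, n_steps = forecast_window(2023, 1, 1, 12, 168)
print(start.isoformat(), "->", valid_time.isoformat(), f"({n_steps} steps)")
```

Under these assumptions, a 7-day lead time corresponds to 28 six-hour steps, and the forecast is valid exactly seven days after the selected start time.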
Run inference with a model
Figure 3 shows how to run an inference session. The `run_date_time()` function takes three parameters: a directory prefix, an AI model name, and a date-time object. The directory prefix determines the disk from which the data is read. The AI model name determines the model to run; in this case, it is Panguweather. Note that you do not need to change any code in the notebooks, because there is already a separate notebook for each of the four models. For example, Figure 4 shows the same function being used to run the Graphcast model. Lastly, the date-time object passes the start date, time, and lead time. All of those values are selected in the previous cell and can be passed to the function directly. All models run on the GPU and take a minute or two to complete.
Figure 3. Run inference with Panguweather.
Figure 4. Run inference with Graphcast.
Prediction data
When the above cell finishes, the results are stored in the variable `pred_data`. You can view its contents by running the cell shown in Figure 5.
Figure 5. Prediction results.
Surface variable visualization
The prediction contains two types of results: surface variables and pressure-level variables. The function `pw_sfc_names()` creates a drop-down list of all the variables in the prediction result. You can choose a variable from the list and then run the `plot_sfc()` function, which produces a side-by-side visualization of the ground truth and the predicted surface-level variable, as shown in Figure 6. The `pw_sfc_names()` function can also be used to select other variables for visualization.
Figure 6. Surface level variable and ground truth visualization.
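A side-by-side plot can be complemented with a single number. The sketch below is not part of the notebooks; it computes the root-mean-square error between a predicted field and the ground truth on a small made-up grid. The function name `rmse` and the sample temperature values are hypothetical, and real fields would come from the prediction and the ERA5 data rather than hand-written lists.

```python
import math


def rmse(predicted, truth):
    """Root-mean-square error between two equally shaped 2D grids (lists of rows)."""
    if len(predicted) != len(truth):
        raise ValueError("grids must have the same number of rows")
    total = 0.0
    count = 0
    for pred_row, truth_row in zip(predicted, truth):
        if len(pred_row) != len(truth_row):
            raise ValueError("rows must have the same length")
        for p, t in zip(pred_row, truth_row):
            total += (p - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("grids must not be empty")
    return math.sqrt(total / count)


# Hypothetical 2 x 2 grids of 2-metre temperature in kelvin.
predicted_t2m = [[288.0, 290.0], [285.0, 281.0]]
truth_t2m = [[287.0, 291.0], [285.0, 283.0]]

error_k = rmse(predicted_t2m, truth_t2m)
print(f"RMSE: {error_k:.3f} K")
```

A smaller value means the predicted field is closer to the ground truth on average; it does not show where the differences are, which is what the side-by-side plot is for.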
Pressure level visualization
The pressure levels can also be visualized along with the ground truth. The `pw_pl_names_levels()` function creates two tabs: from the first, you can select a variable, and from the second, the pressure levels. At least two pressure levels must be selected. Figure 7 shows an example of pressure levels and ground truth visualization.
Figure 7. Pressure-level variable and ground truth visualization.
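The two-level minimum can be checked before plotting. The following is a hypothetical helper, not a function from the NCI modules; the name `check_levels` and the list of available levels (in hPa) are assumptions made for illustration.

```python
# Assumed set of pressure levels in hPa; the levels offered in the
# notebooks may differ.
AVAILABLE_LEVELS_HPA = (1000, 925, 850, 700, 500, 300, 250)


def check_levels(selected, available=AVAILABLE_LEVELS_HPA):
    """Return the selected levels, de-duplicated and sorted from highest pressure down.

    Raises ValueError if a level is not available or if fewer than two
    distinct levels were selected.
    """
    unique = sorted(set(selected), reverse=True)
    unknown = [level for level in unique if level not in available]
    if unknown:
        raise ValueError(f"levels not available: {unknown}")
    if len(unique) < 2:
        raise ValueError("select at least two pressure levels")
    return unique


# Duplicates are dropped, so this selection counts as two levels.
levels = check_levels([500, 850, 500])
print(levels)
```

Counting distinct levels rather than raw selections is a design choice in this sketch: selecting the same level twice would otherwise appear to satisfy the two-level minimum.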