
Pangu-Weather is a deep learning-based global weather forecasting system. It is the first deep learning-based model to outperform a state-of-the-art Numerical Weather Prediction (NWP) model in terms of Anomaly Correlation Coefficient (ACC) and Root Mean Squared Error (RMSE), which makes it a major step forward for deep learning-based weather prediction.

Conventional NWP models solve partial differential equations that encode the physical laws of the atmosphere, which is computationally expensive. A single simulation for a 10-day forecast can take hours on hundreds of nodes, which limits how often the simulations can be run. Furthermore, NWP models are complex and depend on the tuning of many parameters, so expert input is often required to run them smoothly. They are therefore both expensive and time-consuming to run.

A promising alternative is deep learning-based weather prediction, and several models have been put forward in recent years. Deep learning-based models can produce a forecast far faster than NWP-based ones. The choice boils down to a trade-off among training complexity, prediction accuracy and resolution. Pangu-Weather is the first of its kind to outperform a state-of-the-art NWP model in prediction accuracy for high-resolution output in a significant number of cases.

Interested users can refer to the following Nature article for further details:

Pre-trained Models

Pangu-Weather is trained on ERA5 data from the European Centre for Medium-Range Weather Forecasts (ECMWF). A total of 43 years (1979-2021) of data is used: 39 years (1979-2017) for training, the year 2019 for validation, and the remaining three years (2018, 2020 and 2021) for testing. The entire training dataset is about 60 TB, and each model is trained for 100 epochs, which takes 15 to 16 days on 192 V100 GPUs. Because training is this expensive in terms of computing resources, the authors have released a couple of pre-trained models for different prediction intervals.

At NCI, the models can be found in the following locations:

  • 6-hour model: /g/data/wb00/Pangu-Weather/model/pangu_weather_6.onnx
  • 24-hour model: /g/data/wb00/Pangu-Weather/model/pangu_weather_24.onnx
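
As a minimal sketch (not part of the NCI notebook), the listed model files could be opened with ONNX Runtime roughly as follows. The helper names `model_path` and `open_session` are hypothetical, and the sketch assumes that the `onnxruntime` package is available in the loaded environment and that you are a member of wb00.

```python
from pathlib import Path

# Locations as listed above; readable only for members of the wb00 project.
MODEL_DIR = Path("/g/data/wb00/Pangu-Weather/model")
MODEL_FILES = {6: "pangu_weather_6.onnx", 24: "pangu_weather_24.onnx"}


def model_path(lead_hours: int) -> Path:
    """Return the ONNX file for a given lead time (hypothetical helper)."""
    if lead_hours not in MODEL_FILES:
        raise ValueError(f"no pre-trained model listed for {lead_hours} h")
    return MODEL_DIR / MODEL_FILES[lead_hours]


def open_session(lead_hours: int):
    """Create an ONNX Runtime session, preferring the GPU when one is present."""
    # Imported lazily so that model_path() can be used without onnxruntime.
    import onnxruntime as ort

    return ort.InferenceSession(
        str(model_path(lead_hours)),
        providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    )
```

Rather than assuming the names of the model inputs, it is safer to inspect them on the session itself, for example with `for node in open_session(24).get_inputs(): print(node.name, node.shape)`.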


One needs to join the wb00 and dk92 projects to access the files and modules.
Please visit the following page to join NCI projects:


Figure 1 shows a schematic diagram of the 3D Earth-specific transformer used by Pangu-Weather. The NCI Pangu-Weather inference dataset has been created from the ERA5 data, which contains reanalysis data at a high spatial resolution of 0.25 x 0.25 degrees. Each variable at each time stamp and pressure level is therefore a 2D array of 1440 x 721 points. The inference process requires two sources of input data, one for upper-air variables and another for surface variables. The five upper-air variables are given on 13 pressure levels and form a 13 x 1440 x 721 x 5 data cube when combined, while the four surface variables form a 1440 x 721 x 4 data cube. Both inputs are shown on the left-hand side of the diagram.
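
The grid and cube sizes quoted above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the axis order follows the text rather than any particular file layout, and the 4 bytes per value assumes the data is held as 32-bit floats.

```python
import math

RESOLUTION_DEG = 0.25

# 360 degrees of longitude; 180 degrees of latitude with both poles included.
N_LON = int(360 / RESOLUTION_DEG)
N_LAT = int(180 / RESOLUTION_DEG) + 1

# Shapes as described in the text (pressure levels, lon, lat, variables).
UPPER_SHAPE = (13, N_LON, N_LAT, 5)
SURFACE_SHAPE = (N_LON, N_LAT, 4)

BYTES_PER_VALUE = 4  # assumption: float32 storage


def n_values(shape):
    """Number of individual values in a data cube of the given shape."""
    return math.prod(shape)


for name, shape in (("upper-air", UPPER_SHAPE), ("surface", SURFACE_SHAPE)):
    mib = n_values(shape) * BYTES_PER_VALUE / 2**20
    print(f"{name}: shape={shape}, values={n_values(shape)}, ~{mib:.1f} MiB")
```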

Fig. 1: Pangu-Weather Vision Transformer Architecture.

Pangu-Weather uses a variant of the vision transformer modified to take multidimensional weather data as input and produce it as output. The model can be trained for different lead times, but the lead time has to be set before training. As mentioned above, the input consists of a high-dimensional data cube, so patch embedding is used for dimensionality reduction.

First, the original input is embedded into a C-dimensional latent space. The upper-air variables are embedded with a patch size of 2 x 4 x 4, resulting in an embedded data shape of 7 x 360 x 181 x C. The surface variables are embedded with a patch size of 4 x 4, resulting in a 360 x 181 x C data cube. The two data cubes are then combined into an 8 x 360 x 181 x C data cube, which is fed to layer 1. Afterwards, the data is passed through an 8-layer encoder and 8-layer decoder architecture. Each encoder and decoder layer is a vision transformer block modified to align with Earth's geometry. The decoder side performs the exact opposite of the patch embedding to reconstruct the output, so the final high-resolution output is produced from the final decoder block.
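
The embedded shapes follow from dividing each axis by the patch size and rounding up. The sketch below reproduces that arithmetic; it assumes that axes which are not an exact multiple of the patch size (13 levels, 721 latitudes) are padded up to the next multiple, and the function name `patched_shape` is purely illustrative.

```python
import math


def patched_shape(shape, patch):
    """Number of patches along each axis, rounding up for padded axes."""
    return tuple(math.ceil(n / p) for n, p in zip(shape, patch))


# Upper-air cube: 13 pressure levels x 1440 x 721, patch size 2 x 4 x 4.
upper = patched_shape((13, 1440, 721), (2, 4, 4))

# Surface cube: 1440 x 721, patch size 4 x 4.
surface = patched_shape((1440, 721), (4, 4))

# The surface patches are stacked as one extra slice on the vertical axis.
combined = (upper[0] + 1,) + upper[1:]

print(upper, surface, combined)
```

Each of these shapes carries an additional trailing dimension of size C, the width of the latent space, which is omitted here because the text does not state its value.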

Key findings

In this section, some of the key findings and features of Pangu-Weather are discussed for the benefit of researchers.

Prediction accuracy

Figure 2 compares the prediction accuracy and error of Pangu-Weather with those of the state-of-the-art NWP and AI models. The Integrated Forecasting System (IFS) is the state-of-the-art global NWP system developed by ECMWF, while FourCastNet is the state-of-the-art AI weather prediction model. The results show that Pangu-Weather outperforms both in a significant number of cases.

The four plots on the left side of Figure 2 compare the predictions of the three models for lead times from 6 to 168 hours. The two variables used are 500 hPa geopotential (Z500) and 850 hPa temperature (T850). Pangu-Weather (shown in red) has a higher anomaly correlation coefficient (ACC) and a lower root mean squared error (RMSE) than the other models for all prediction intervals. Note that FourCastNet (shown in black) has significantly lower accuracy and higher error than both IFS and Pangu-Weather.
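
To make the two metrics concrete, here is a simplified sketch of how RMSE and ACC can be computed for a single field. It treats the field as a flat list of grid-point values and gives every point equal weight; published evaluations typically weight points by latitude, which is omitted here, and the sample numbers are made up for illustration.

```python
import math


def rmse(forecast, truth):
    """Root mean squared error between forecast and ground truth."""
    squared = [(f - t) ** 2 for f, t in zip(forecast, truth)]
    return math.sqrt(sum(squared) / len(squared))


def acc(forecast, truth, climatology):
    """Anomaly correlation: correlation of departures from climatology."""
    f_anom = [f - c for f, c in zip(forecast, climatology)]
    t_anom = [t - c for t, c in zip(truth, climatology)]
    numerator = sum(a * b for a, b in zip(f_anom, t_anom))
    denominator = math.sqrt(
        sum(a * a for a in f_anom) * sum(b * b for b in t_anom)
    )
    return numerator / denominator


# Made-up values for three grid points.
truth = [2.0, 4.0, 6.0]
climatology = [3.0, 3.0, 3.0]
forecast = [2.5, 4.5, 5.0]
print(rmse(forecast, truth), acc(forecast, truth, climatology))
```

A lower RMSE and an ACC closer to 1 both indicate a better forecast, which is how the curves in Figure 2 should be read.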

On the right side of Figure 2, four plots compare the month-by-month prediction results of IFS and Pangu-Weather for four variables: Z500, Q500, T500 and U500. Pangu-Weather (shown in red) has a higher anomaly correlation coefficient (ACC) for each variable in every month, so it performs consistently better throughout the year.


Fig. 2: Pangu-Weather outperforms the state-of-the-art NWP and AI models.

Three-day forecast visualization

Figure 3 visualizes Pangu-Weather's 3-day prediction for two variables: 2m temperature and 10m wind speed. In each case, the input is ERA5 data from September 1st, 2018 at 00:00 UTC. The 2m temperature is shown in the top row and the 10m wind speed in the bottom row. Note that the predictions look smoother than the ground truth. This is because, in a deep learning model, the value at each point is learned relative to the other points, and the model tends to average out small changes among neighbouring points. As a result, transitions appear smoother than in the ground truth.

Fig. 3: A limitation of the deep learning prediction compared to NWP is that the transition edges are not perfectly defined. 

Extreme weather events 

In addition to prediction, Pangu-Weather can also be used for tracking extreme weather events. Experimental results show that Pangu-Weather tracks extreme weather events better than the current state-of-the-art NWP system.

Figure 4 shows the paths of two typhoons in 2018, Kong-rey (2018-25) and Yutu (2018-26). The 48-hour-ahead prediction by Pangu-Weather (shown in red) is much closer to the ground truth (shown in black), whereas the ECMWF high-resolution (HRES) model predicted a different path 48 hours in advance.

Fig. 4: Extreme weather tracking with Pangu-Weather. 

Pangu-Weather Notebook on ARE

  1. Log in to the ARE website:
  2. Fill in the following fields:

    Walltime (hours): <as required>
    Queue: gpuvolta
    Compute Size: 1gpu
    Project: <Your project code>
    Storage: gdata/dk92+gdata/wb00+gdata/rt52+gdata/<your project code>+scratch/<your project code>

    Module directories: /g/data/dk92/apps/Modules/modulefiles
    Modules: NCI-ai-ml/23.10

  3. Click Launch to start a JupyterLab session.
  4. Clone the repo to your local file system and open the file "inference.ipynb" in the JupyterLab session.
