Kim et al. (2026) SIGMAformer: a spatiotemporal Gaussian mixture correlation transformer for global weather forecasting
Identification
- Journal: npj Climate and Atmospheric Science
- Year: 2026
- Date: 2026-03-23
- Authors: D. H. Kim, Heung-Il Suk
- DOI: 10.1038/s41612-026-01385-w
Research Groups
- Department of Artificial Intelligence, Korea University, Seoul 02841, Republic of Korea
- Department of AI and Data, Korea Environment Institute, Sejong 30147, Republic of Korea
Short Summary
This paper introduces SIGMAformer, a spatiotemporal Gaussian mixture correlation transformer for global multi-station weather forecasting, which integrates a dynamic spatiotemporal correlation (DSTC) mechanism with a Gaussian mixture pattern extractor (GMPE) to adaptively model nonlinear dependencies. The model consistently outperforms state-of-the-art forecasting models in global wind speed and temperature prediction, especially for extreme events, while providing interpretable insights into spatiotemporal patterns.
Objective
- To develop a spatiotemporal forecasting architecture (SIGMAformer) that integrates a Gaussian mixture pattern extractor (GMPE) with a dynamic spatiotemporal correlation (DSTC) mechanism to jointly model nonlinear dependencies and emphasize salient spatiotemporal patterns for accurate and interpretable global multi-station weather forecasting.
Study Configuration
- Spatial Scale: Global, covering 3,850 weather observation stations worldwide. Input data spatial resolution of 0.25 degrees (approximately 30 km by 30 km) and 0.5 degrees.
- Temporal Scale: Hourly average measurements. Historical input window length of 48 hours, forecast window length of 24 hours. Data covers January 1, 2019, to December 31, 2020.
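The 48-hour input and 24-hour forecast windows imply a sliding-window sample layout. The sketch below shows one plausible way to cut a single station's hourly series into (history, target) pairs. The function name `make_windows`, the stride of 1, and the toy series are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence, Tuple

INPUT_LEN = 48  # hours of history per sample (from the summary)
HORIZON = 24    # hours to forecast per sample (from the summary)


def make_windows(
    series: Sequence[float],
    input_len: int = INPUT_LEN,
    horizon: int = HORIZON,
    stride: int = 1,  # assumption: the summary does not state the stride
) -> List[Tuple[List[float], List[float]]]:
    """Slice one station's hourly series into (history, target) pairs."""
    samples = []
    last_start = len(series) - input_len - horizon
    for start in range(0, last_start + 1, stride):
        split = start + input_len
        history = list(series[start:split])
        target = list(series[split:split + horizon])
        samples.append((history, target))
    return samples


# Hypothetical toy series: 100 hourly values for a single station.
toy = [float(h) for h in range(100)]
pairs = make_windows(toy)
print(len(pairs))                           # 29 windows fit into 100 hours
print(len(pairs[0][0]), len(pairs[0][1]))   # 48 24
```

With 3,850 stations, the same slicing would be applied along the time axis of a (time, station) array so that every sample carries all stations at once.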
Methodology and Data
- Models used:
  - Proposed: SIGMAformer (SpatIotemporal Gaussian Mixture correlAtion transformer)
  - Key components of SIGMAformer: Gaussian Mixture Pattern Extractor (GMPE), Dynamic Spatiotemporal Correlation (DSTC) module (including temporal and spatial correlation modeling), series decomposition modules, and a position-wise feedforward network (FFN).
  - Baselines: Repeat last time, Repeat last period, ARIMA, GFS (prediction, 0.25°), GFS (reanalysis, 0.5°), ERA5 (prediction, 0.5°), ERA5 (reanalysis, 0.25°), StemGNN (2020), FEDformer (2022), N-HiTS (2023), Corrformer (2023), iTransformer (2024), TimeMixer (2024).
- Data sources:
  - Two datasets ("global wind" and "global temperature") derived from the National Centers for Environmental Information (NCEI) global dataset (specifically, GFS data).
  - Hourly average wind speed and temperature measurements from 3,850 stations worldwide.
  - Dataset split chronologically into training, validation, and test sets using a 7:1:2 ratio.
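A chronological 7:1:2 split keeps the time order intact, so the validation and test periods always lie after the training period. The following is a minimal sketch of such a split. The helper name `chronological_split`, the integer rounding rule, and the assumption that no hourly steps are missing are all illustrative and may differ from the paper's preprocessing.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    timestamps: Sequence[int],
    ratios: Tuple[int, int, int] = (7, 1, 2),
) -> Tuple[List[int], List[int], List[int]]:
    """Split time-ordered steps into train/val/test without shuffling."""
    total = sum(ratios)
    n = len(timestamps)
    train_end = n * ratios[0] // total
    val_end = n * (ratios[0] + ratios[1]) // total
    ordered = list(timestamps)
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]


# 2019-01-01 through 2020-12-31 spans 731 days (2020 is a leap year),
# i.e. 731 * 24 hourly steps, assuming no hours are missing.
hours = list(range(731 * 24))
train, val, test = chronological_split(hours)
print(len(train), len(val), len(test))  # 12280 1755 3509
```

Because the slices are contiguous, no future observation leaks into training, which a random split would not guarantee.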
Main Results
- SIGMAformer achieved state-of-the-art performance in global wind speed and temperature forecasting, consistently outperforming all baseline models in mean squared error (MSE) and mean absolute error (MAE).
- For global wind speed, SIGMAformer recorded an MSE of 3.818 and MAE of 1.269, representing a 1.83% reduction in MSE and 2.68% reduction in MAE compared to Corrformer, and a 19.63% reduction in MSE and 9.52% reduction in MAE compared to FEDformer.
- For global temperature, SIGMAformer achieved an MSE of 7.698 and MAE of 1.887, again the lowest errors among the compared models.
- In extreme event forecasting (defined by station-wise 95th percentile thresholds), SIGMAformer exhibited the lowest MAE and False Alarm Ratio (FAR) and the highest Critical Success Index (CSI) for both wind speed and temperature extremes, indicating robust and reliable prediction.
- Ablation studies confirmed the critical role of the Dynamic Spatiotemporal Correlation (DSTC) module, with its removal increasing MSE by up to 7.18% for wind speed and 7.22% for temperature. The spatial correlation component within DSTC was particularly impactful, with its removal leading to MSE increases of up to 10.27% for wind and 14.69% for temperature.
- Visualization of attention patterns revealed that DSTC adaptively focuses on meteorologically meaningful spatiotemporal contexts, capturing distinct physical responses for temperature (e.g., delayed peaks for teleconnections) and wind speed (e.g., short-range sensitivity, Rossby wave patterns).
- SIGMAformer demonstrated stable performance across various architectural hyperparameters, and optimal Gaussian Mixture Pattern Extractor (GMPE) settings were identified (cluster count k = 3, sampling rate r = 2, update interval of 100 iterations).
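The extreme-event scores above can be illustrated with the standard contingency-table definitions: CSI = hits / (hits + misses + false alarms) and FAR = false alarms / (hits + false alarms). The sketch below applies them to one station with a 95th-percentile threshold. The nearest-rank percentile rule, the strict `>` exceedance test, the upper-tail-only definition of an extreme, and all sample values are assumptions made for illustration; the paper may compute its thresholds and events differently.

```python
import math
from typing import Dict, Sequence


def percentile_nearest_rank(values: Sequence[float], q: float) -> float:
    """Nearest-rank percentile for q in (0, 100]; one of several conventions."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100.0))
    return ordered[rank - 1]


def extreme_event_scores(
    observed: Sequence[float],
    predicted: Sequence[float],
    threshold: float,
) -> Dict[str, float]:
    """CSI and FAR for exceedance of `threshold` at a single station."""
    hits = misses = false_alarms = 0
    for obs, pred in zip(observed, predicted):
        obs_event, pred_event = obs > threshold, pred > threshold
        if obs_event and pred_event:
            hits += 1
        elif obs_event:
            misses += 1
        elif pred_event:
            false_alarms += 1
    csi_den = hits + misses + false_alarms
    far_den = hits + false_alarms
    return {
        "CSI": hits / csi_den if csi_den else float("nan"),
        "FAR": false_alarms / far_den if far_den else float("nan"),
    }


# Hypothetical station record used only to derive a threshold.
history = [float(v) for v in range(1, 101)]
threshold = percentile_nearest_rank(history, 95)  # 95.0

# Hypothetical observed and forecast values for five hours.
observed = [96.0, 97.0, 50.0, 98.0, 10.0]
predicted = [99.0, 80.0, 96.0, 97.0, 12.0]
scores = extreme_event_scores(observed, predicted, threshold)
print(scores)  # 2 hits, 1 miss, 1 false alarm -> CSI 0.5, FAR 1/3
```

A higher CSI and a lower FAR both indicate better event detection, which is the direction of improvement reported for SIGMAformer.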
Contributions
- Proposed a dynamic correlation mechanism that adaptively integrates spatial and temporal dependencies, addressing limitations of static graph structures in existing models.
- Introduced a Gaussian mixture pattern extractor (GMPE) that learns pattern-specific weights from query inputs to capture nonlinear spatiotemporal relationships, emphasizing salient patterns and reducing noise.
- Developed a scalable encoder–decoder architecture that integrates DSTC, GMPE, and series decomposition modules to jointly model short-term fluctuations and long-term atmospheric trends, enhancing both forecasting accuracy and interpretability.
- Achieved state-of-the-art accuracy in global temperature and wind speed forecasting, particularly for extreme events, and provided interpretable insights into dynamic spatiotemporal dependencies.
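The summary does not say how the series decomposition modules separate short-term fluctuations from long-term trends. A common choice in related transformer forecasters is a moving-average decomposition, sketched below purely as an assumption: the function name `decompose`, the default kernel size of 25, and the edge-replication padding are illustrative and are not confirmed by the source.

```python
from typing import List, Sequence, Tuple


def decompose(
    series: Sequence[float], kernel: int = 25
) -> Tuple[List[float], List[float]]:
    """Split a non-empty series into a moving-average trend and a residual."""
    if kernel < 1 or kernel % 2 == 0:
        raise ValueError("kernel must be a positive odd integer")
    half = kernel // 2
    # Replicate the edge values so the trend has the same length as the input.
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    trend = [sum(padded[i:i + kernel]) / kernel for i in range(len(series))]
    residual = [x - t for x, t in zip(series, trend)]
    return trend, residual


# Hypothetical toy series with a single spike, smoothed with a 3-step window.
toy = [0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0]
trend, residual = decompose(toy, kernel=3)
print(trend)     # [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0]
print(residual)  # [0.0, 0.0, -1.0, 2.0, -1.0, 0.0, 0.0]
```

The trend captures the slowly varying component, while the residual keeps the short-term fluctuation; adding the two back together recovers the input.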
Funding
- Institute of Information & Communications Technology Planning & Evaluation (IITP) through a grant funded by the Korea government (MSIT) (Grant No. RS-2019-II190079) as part of the Artificial Intelligence Graduate School Program at Korea University.
Citation
@article{Kim2026SIGMAformer,
author = {Kim, D. H. and Suk, Heung-Il},
title = {SIGMAformer: a spatiotemporal Gaussian mixture correlation transformer for global weather forecasting},
journal = {npj Climate and Atmospheric Science},
year = {2026},
doi = {10.1038/s41612-026-01385-w},
url = {https://doi.org/10.1038/s41612-026-01385-w}
}
Original Source: https://doi.org/10.1038/s41612-026-01385-w