Wang et al. (2026) Cross-platform super-resolution: A diffusion model approach for enhancing satellite imagery with aerial data
Identification
- Journal: International Journal of Applied Earth Observation and Geoinformation
- Year: 2026
- Date: 2026-01-06
- Authors: Zhe Wang, Carmen Galaz García, Benjamin S. Halpern
- DOI: 10.1016/j.jag.2025.105046
Research Groups
- National Center for Ecological Analysis and Synthesis, University of California, Santa Barbara, USA
- Bren School of Environmental Science and Management, University of California Santa Barbara, USA
Short Summary
This study investigates cross-platform super-resolution to enhance 3-meter PlanetScope satellite imagery with 60-centimeter NAIP aerial data using a diffusion model (SR3). It reveals a significant "domain gap" between data sources, leading to lower cross-platform performance (best PSNR 16.85 dB) compared to single-source super-resolution (PSNR 27.28 dB), and demonstrates that models trained from scratch outperform fine-tuned models, suggesting negative transfer.
Objective
- To evaluate the feasibility of applying the Super-Resolution via Iterative Refinement (SR3) diffusion model to upscale 3-meter PlanetScope satellite imagery to the 60-centimeter resolution of National Agriculture Imagery Program (NAIP) aerial imagery, addressing the novel challenge of cross-platform super-resolution.
Study Configuration
- Spatial Scale:
  - Input imagery: 3-meter spatial resolution (PlanetScope).
  - Target imagery: 60-centimeter spatial resolution (NAIP).
  - Study area: Santa Barbara County, California, United States.
  - Test area: An out-of-sample region outside Santa Barbara County, California, United States.
  - Image patch size: 128 pixels × 128 pixels.
- Temporal Scale:
  - NAIP data: 2020 acquisition; NAIP imagery is collected during agricultural growing seasons on a two-year cycle.
  - PlanetScope data: May and June 2020, drawn from near-daily acquisitions.
Methodology and Data
- Models used:
  - Super-Resolution via Iterative Refinement (SR3) model, an architecture based on the Denoising Diffusion Probabilistic Model (DDPM).
  - Adam optimizer (used for training).
  - Bicubic interpolation (for baseline comparison).
- Data sources:
  - National Agriculture Imagery Program (NAIP) aerial imagery: 60-centimeter spatial resolution; Red, Green, Blue, and Near-infrared (NIR) spectral bands.
  - PlanetScope satellite imagery: 3-meter spatial resolution; Red, Green, Blue, and Near-infrared (NIR) spectral bands (selected from 8 available bands).
  - Training data: 1,221,097 paired patches for the single-source setting (NAIP 60 cm paired with NAIP downsampled to 3 m) and 1,173,523 paired patches for the cross-platform setting (NAIP 60 cm paired with PlanetScope 3 m).
  - Test data: 7,857 PlanetScope 3 m patches and corresponding NAIP 60 cm reference patches from an out-of-sample area.
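Since SR3 is DDPM-based, its training relies on the standard DDPM forward process, which blends a clean high-resolution patch with Gaussian noise; the network, conditioned on the low-resolution input, then learns to predict that noise. The following is a minimal sketch of the forward noising step only, not the authors' implementation. The linear schedule, its endpoints, the 1,000 steps, the [-1, 1] scaling, the random stand-in patch, and all function names are assumptions made for illustration.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Hypothetical linear noise schedule; the endpoints are illustrative only."""
    return np.linspace(beta_start, beta_end, num_steps)

def forward_noise(x0, t, alpha_bar, rng):
    """Sample x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I)."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative fraction of signal retained

rng = np.random.default_rng(0)
# Stand-in for a 4-band (R, G, B, NIR) 128 x 128 target patch scaled to [-1, 1].
hr_patch = rng.uniform(-1.0, 1.0, size=(4, 128, 128))
x_t, eps = forward_noise(hr_patch, t=500, alpha_bar=alpha_bar, rng=rng)
```

At small `t` the sample stays close to the clean patch; at the final step it is almost pure noise, which is where iterative refinement starts at inference time.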
Main Results
- The SR3 model upscaled 3-meter PlanetScope imagery to 60-centimeter resolution, a fivefold increase in linear spatial resolution (3 m / 0.6 m = 5).
- The model demonstrated robust performance on single-source data (downsampled NAIP to NAIP), yielding a Peak Signal-to-Noise Ratio (PSNR) of 27.28 dB and a Structural Similarity Index Measure (SSIM) of 0.72.
- Cross-platform super-resolution (PlanetScope to NAIP) performance was substantially lower, with the best PSNR of 16.85 dB and SSIM of 0.42 (Model 9, trained from scratch with 100% data), highlighting a significant "domain gap."
- Models trained from scratch consistently outperformed fine-tuned models for cross-platform super-resolution, indicating a "negative transfer" effect where pre-training on homogeneous data was detrimental.
  - PSNR for scratch-trained models increased from 15.55 dB (25% data) to 16.85 dB (100% data).
  - PSNR for fine-tuned models decreased from 16.53 dB (25% data) to 15.90 dB (100% data).
- The Normalized Difference Vegetation Index (NDVI) proved to be a more effective performance indicator for environmental applications than standard computer vision metrics (PSNR, SSIM), showing better preservation of critical spectral information.
  - NDVI-derived PSNR and SSIM values were higher than those calculated directly from spectral bands (e.g., Model 2 achieved the highest NDVI-derived PSNR of 19.76 dB).
  - Pearson correlation coefficients (R) for NDVI values were consistently strong across most models (e.g., Model 7 achieved 0.76).
- Visual quality assessment showed Model 9 (trained from scratch, 100% data) produced sharper boundaries and more coherent spatial patterns for features like roads and trees, closely resembling NAIP reference images.
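To put the PSNR figures above in perspective, the sketch below implements the standard PSNR definition and converts a PSNR gap into a mean-squared-error ratio. It is illustrative only: the `data_range` of 1.0 and the function names are assumptions, the paper's normalization is not stated here, and the ratio is meaningful only if both PSNR values were computed with the same data range.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB; data_range is the assumed maximum pixel value span."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)

def mse_ratio_from_psnr_gap(psnr_high, psnr_low):
    """Factor by which MSE grows when PSNR drops from psnr_high to psnr_low."""
    return 10.0 ** ((psnr_high - psnr_low) / 10.0)

# Single-source (27.28 dB) versus best cross-platform (16.85 dB) result.
gap_ratio = mse_ratio_from_psnr_gap(27.28, 16.85)
print(f"MSE ratio implied by the PSNR gap: {gap_ratio:.2f}")
```

Under that shared-range assumption, the 10.43 dB gap corresponds to a cross-platform mean squared error roughly 11 times that of the single-source case.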
Contributions
- Conducted the first systematic investigation into the data requirements for applying a deep learning model for super-resolution across disparate aerial (NAIP) and satellite (PlanetScope) imaging platforms.
- Integrated the advanced diffusion model-based Super-Resolution via Iterative Refinement (SR3) model for cross-platform super-resolution, specifically upscaling PlanetScope imagery using NAIP data.
- Explored the impact of data volume and transfer learning strategies, uncovering a counter-intuitive "negative transfer" effect where training from scratch outperformed fine-tuning for cross-platform super-resolution.
- Augmented traditional evaluation metrics with the Normalized Difference Vegetation Index (NDVI) to provide a more domain-relevant assessment of model performance for environmental applications, demonstrating enhanced spectral utility.
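An NDVI-based evaluation of this kind can be sketched as follows: compute NDVI = (NIR - Red) / (NIR + Red) for the reference and the super-resolved patch, then compare the two maps with a Pearson correlation. This is a hedged illustration rather than the authors' pipeline. The band order (R, G, B, NIR), the reflectance-like value range, the synthetic patches, the small `eps` guard against division by zero, and the function names are all assumptions.

```python
import numpy as np

def ndvi(red, nir, eps=1e-8):
    """Normalized Difference Vegetation Index; eps avoids division by zero."""
    red = np.asarray(red, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    return (nir - red) / (nir + red + eps)

def pearson_r(a, b):
    """Pearson correlation between two arrays, flattened to 1-D."""
    return float(np.corrcoef(np.ravel(a), np.ravel(b))[0, 1])

rng = np.random.default_rng(0)
# Synthetic stand-ins: a reference patch and a noisy "super-resolved" patch,
# both with assumed band order (R, G, B, NIR) and positive values.
reference = rng.uniform(0.05, 1.0, size=(4, 128, 128))
super_resolved = np.clip(reference + rng.normal(0.0, 0.05, reference.shape), 0.01, 1.0)

ndvi_reference = ndvi(reference[0], reference[3])
ndvi_super_resolved = ndvi(super_resolved[0], super_resolved[3])
r = pearson_r(ndvi_reference, ndvi_super_resolved)
print(f"Pearson R between NDVI maps: {r:.3f}")
```

Because NDVI is a band ratio, it is less sensitive to uniform brightness differences between sensors than per-band pixel errors, which is one plausible reason NDVI-derived scores can look better than band-level PSNR and SSIM.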
Funding
- NASA grant #80NSSC23K1561
Citation
@article{Wang2026Crossplatform,
author = {Wang, Zhe and Galaz García, Carmen and Halpern, Benjamin S.},
title = {Cross-platform super-resolution: A diffusion model approach for enhancing satellite imagery with aerial data},
journal = {International Journal of Applied Earth Observation and Geoinformation},
year = {2026},
doi = {10.1016/j.jag.2025.105046},
url = {https://doi.org/10.1016/j.jag.2025.105046}
}
Original Source: https://doi.org/10.1016/j.jag.2025.105046