Clark et al. (2026) Comment on Williams (2025): “Friends don't let friends use NSE or KGE for hydrologic model accuracy evaluation: A rant with data and suggestions for better practice”
Identification
- Journal: Environmental Modelling & Software
- Year: 2026
- Date: 2026-01-10
- Authors: Martyn P. Clark, Wouter J.M. Knoben, Diana Spieler, Gaby J. Gründemann, Cyril Thébault, Nicolás Vasquéz, Andy Wood, Yalan Song, Chaopeng Shen, Shaun T. Carney, Katie van Werkhoven
- DOI: 10.1016/j.envsoft.2026.106869
Research Groups
Not specified in the provided text.
Short Summary
This commentary critically evaluates the recommendation by Williams (2025) to abandon Nash–Sutcliffe Efficiency (NSE) and Kling–Gupta Efficiency (KGE) for hydrologic model evaluation, arguing that replacing them with error-based metrics such as Root Mean Squared Error (RMSE) does not resolve the underlying issues and overlooks the value of skill scores in standardized benchmarking.
Objective
- To discuss three main limitations of Williams (2025) regarding hydrologic model evaluation metrics, and to argue against abandoning skill scores in favor of error-based metrics.
Study Configuration
- Spatial Scale: Not applicable; this is a commentary on evaluation metrics, not a study applied to a specific geographic domain.
- Temporal Scale: Not applicable; no specific study period is analyzed.
Methodology and Data
- Models used: Not applicable; this is a commentary discussing the theoretical and practical implications of different evaluation metrics (e.g., Nash–Sutcliffe Efficiency, Kling–Gupta Efficiency, Root Mean Squared Error, Mean Absolute Error, percent bias).
- Data sources: Not applicable; this paper is a critical discussion of existing literature and evaluation practices, not an analysis of new data.
Main Results
- Williams (2025) gives insufficient attention to the broader literature on hydrologic model evaluation, weakening its recommendations.
- Replacing skill scores (NSE, KGE) with error-based metrics (RMSE, MAE) does not resolve the fundamental issue of conflating spatial variations in model accuracy with variations in flow variability.
- Williams (2025) overlooks the significant value of NSE and KGE in establishing standardized test environments for consistent model comparison.
- The proposed path by Williams (2025) could lead the community away from more constructive approaches to advance hydrologic model evaluation methods.
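The second point above — that error-based metrics do not remove the entanglement between model accuracy and flow variability — follows directly from the standard metric definitions. A minimal sketch (synthetic, invented basin values; textbook formulas for RMSE, NSE, and KGE, not code from either paper) shows two records with identical RMSE but very different NSE, because the NSE denominator is the variance of the observations:

```python
import math

def rmse(obs, sim):
    """Root Mean Squared Error: an absolute, unit-bearing error measure."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 - SSE / SS_obs, i.e. squared error
    normalized by observed variability (skill vs. the observed-mean benchmark)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((s - o) ** 2 for o, s in zip(obs, sim))
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / ss_obs

def kge(obs, sim):
    """Kling-Gupta Efficiency: combines correlation (r), variability ratio
    (alpha = sigma_sim/sigma_obs), and bias ratio (beta = mu_sim/mu_obs)."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    so = math.sqrt(sum((o - mo) ** 2 for o in obs) / n)
    ss = math.sqrt(sum((s - ms) ** 2 for s in sim) / n)
    r = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / (n * so * ss)
    alpha, beta = ss / so, ms / mo
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Two hypothetical records with the SAME errors (hence the same RMSE)
# but different flow variability (the values are illustrative only).
flashy_obs = [1.0, 9.0, 2.0, 8.0, 1.5, 8.5]   # high-variability regime
flashy_sim = [1.5, 8.5, 2.5, 7.5, 2.0, 8.0]
placid_obs = [4.0, 5.0, 4.5, 5.5, 4.2, 5.3]   # low-variability regime
placid_sim = [4.5, 4.5, 5.0, 5.0, 4.7, 4.8]

print(rmse(flashy_obs, flashy_sim), rmse(placid_obs, placid_sim))  # both 0.5
print(nse(flashy_obs, flashy_sim), nse(placid_obs, placid_sim))    # ~0.98 vs ~0.19
```

The same absolute error yields near-perfect NSE in the flashy basin and poor NSE in the placid one; conversely, reporting only RMSE discards the information about how hard each record is to predict, which is the normalization that skill scores provide.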
Contributions
- Provides a balanced and critical response to Williams (2025), challenging the recommendation to abandon widely used skill scores.
- Reaffirms the importance of considering existing literature and the context of metric selection in hydrologic model evaluation.
- Highlights the continued utility of skill scores like NSE and KGE for standardized benchmarking and model comparison, despite their known limitations.
- Argues for a more nuanced approach to improving hydrologic model evaluation rather than a simple replacement of metrics.
Funding
Not specified in the provided text.
Citation
@article{Clark2026Comment,
author = {Clark, Martyn P. and Knoben, Wouter J.M. and Spieler, Diana and Gründemann, Gaby J. and Thébault, Cyril and Vasquéz, Nicolás and Wood, Andy and Song, Yalan and Shen, Chaopeng and Carney, Shaun T. and van Werkhoven, Katie},
title = {Comment on Williams (2025): “Friends don't let friends use NSE or KGE for hydrologic model accuracy evaluation: A rant with data and suggestions for better practice”},
journal = {Environmental Modelling \& Software},
year = {2026},
doi = {10.1016/j.envsoft.2026.106869},
url = {https://doi.org/10.1016/j.envsoft.2026.106869}
}
Original Source: https://doi.org/10.1016/j.envsoft.2026.106869