Fengour et al. (2026) A taxonomy-based benchmark of parametric and non-parametric machine learning models for data-driven precipitation prediction in Morocco
Identification
- Journal: Theoretical and Applied Climatology
- Year: 2026
- Date: 2026-03-18
- Authors: Abdelhak EL Fengour, Saloua El Motaki
- DOI: 10.1007/s00704-026-06174-2
Research Groups
- Laboratory of Geomorphology, Environment and Society, Faculty of Letters and Human Sciences, Cadi Ayyad University, UCA, Marrakech, Morocco
- Department of Computer Science, Faculty of Sciences, Abdelmalek Essaâdi University, Tetouan, Morocco
Short Summary
This study introduces a taxonomy-based benchmark of parametric and non-parametric machine learning models for data-driven precipitation prediction in Morocco. It demonstrates that non-parametric models consistently outperform parametric models, effectively capturing the complex, non-linear relationships inherent in highly intermittent and zero-inflated rainfall data.
Objective
- Evaluate the effectiveness of machine learning algorithms in modeling highly intermittent, zero-inflated rainfall observations.
- Propose a structured taxonomy that classifies machine learning algorithms according to their parametric nature.
- Benchmark the selected models against one another using standardized error measures (MAE and RMSE).
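The two error measures reported under Main Results, MAE and RMSE, can be sketched in a few lines of standard-library Python. This is a minimal illustration of the standard definitions, not the authors' evaluation code. The function names and the sample values are assumptions made for the example, and the sample values are invented rather than taken from the study.

```python
import math
from typing import Sequence


def _check(observed: Sequence[float], predicted: Sequence[float]) -> None:
    """Reject empty or mismatched inputs before computing a metric."""
    if len(observed) == 0 or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")


def mean_absolute_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """MAE: the average absolute difference between observation and prediction."""
    _check(observed, predicted)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def root_mean_squared_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """RMSE: the square root of the average squared difference."""
    _check(observed, predicted)
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Invented daily totals in meters (hypothetical, NOT data from the study).
# Most days are dry, mimicking a zero-inflated rainfall series.
observed = [0.0, 0.0, 0.004, 0.0, 0.0]
predicted = [0.0, 0.001, 0.002, 0.0, 0.0]

mae = mean_absolute_error(observed, predicted)
rmse = root_mean_squared_error(observed, predicted)
print(f"MAE = {mae:.6f} m, RMSE = {rmse:.6f} m")
```

Because RMSE squares each error before averaging, it weights the occasional large miss on a wet day more heavily than MAE does, which is one reason the two measures are usually reported together for intermittent rainfall.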
Study Configuration
- Spatial Scale: Nine meteorological stations selected to span diverse topographical and climatic zones in Morocco (e.g., El Gaida at 1,056 m and Dar El Arsa at 138 m above sea level).
- Temporal Scale: Daily meteorological records spanning from January 1, 1994, to December 31, 2023 (30 years).
Methodology and Data
- Models used:
  - Parametric: Multivariate Linear Regression (MLR), Ridge Regression, Lasso Regression, ElasticNet Regression, Multi-Layer Perceptron (MLP).
  - Non-parametric: K-Nearest Neighbors (KNN), Decision Trees (DT), Random Forest (RF), Extreme Gradient Boosting (XGBoost), Support Vector Regressor (SVR).
- Data sources: Meteorological records obtained from the Visual Crossing Weather API.
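The parametric versus non-parametric taxonomy listed above can be written down as a small registry. The sketch below is a hypothetical representation, not the authors' code: the `ModelEntry` class, the `TAXONOMY` list, and both helper functions are names invented for illustration. Only the family assignments and the two approximate test scores quoted in this summary (XGBoost and MLP) come from the source; scores for the other models are not given here and are left as `None`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ModelEntry:
    """One benchmarked model and its place in the taxonomy (hypothetical type)."""
    name: str
    family: str                          # "parametric" or "non-parametric"
    test_mae_m: Optional[float] = None   # approximate test MAE in meters, if reported
    test_rmse_m: Optional[float] = None  # approximate test RMSE in meters, if reported


TAXONOMY: List[ModelEntry] = [
    ModelEntry("MLR", "parametric"),
    ModelEntry("Ridge", "parametric"),
    ModelEntry("Lasso", "parametric"),
    ModelEntry("ElasticNet", "parametric"),
    ModelEntry("MLP", "parametric", test_mae_m=0.00044, test_rmse_m=0.00060),
    ModelEntry("KNN", "non-parametric"),
    ModelEntry("DT", "non-parametric"),
    ModelEntry("RF", "non-parametric"),
    ModelEntry("XGBoost", "non-parametric", test_mae_m=0.00043, test_rmse_m=0.00059),
    ModelEntry("SVR", "non-parametric"),
]


def group_by_family(entries: List[ModelEntry]) -> Dict[str, List[str]]:
    """Map each family label to its model names, preserving list order."""
    groups: Dict[str, List[str]] = {}
    for entry in entries:
        groups.setdefault(entry.family, []).append(entry.name)
    return groups


def best_reported(entries: List[ModelEntry], family: str) -> Optional[ModelEntry]:
    """Return the lowest-MAE model in a family among those with a recorded score."""
    scored = [e for e in entries if e.family == family and e.test_mae_m is not None]
    return min(scored, key=lambda e: e.test_mae_m) if scored else None


groups = group_by_family(TAXONOMY)
for family, names in groups.items():
    best = best_reported(TAXONOMY, family)
    print(family, names, "best reported:", best.name if best else "n/a")
```

Keeping the family label next to each model makes it straightforward to aggregate any metric per family once the remaining scores are filled in from the paper.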
Main Results
- Non-parametric models consistently exhibited superior predictive performance compared to parametric models for precipitation prediction in Morocco.
- Ensemble-based non-parametric models, specifically XGBoost and Random Forest, achieved the highest accuracy. XGBoost showed the lowest test Mean Absolute Error (MAE), approximately 0.00043 meters (0.43 mm), and the lowest Root Mean Squared Error (RMSE), approximately 0.00059 meters (0.59 mm).
- Among parametric models, the Multi-Layer Perceptron (MLP) performed best, with a test MAE of approximately 0.00044 meters (0.44 mm) and an RMSE of approximately 0.00060 meters (0.60 mm), outperforming all linear regression variants.
- Parametric linear models (MLR, Ridge, Lasso, ElasticNet) were limited by their inherent linear assumptions, struggling to capture the complex, non-linear, intermittent, and zero-inflated characteristics of daily precipitation in semi-arid regions.
- Cloud cover (Pearson correlation coefficient, r ≈ 0.52) and humidity (r ≈ 0.48) showed the strongest positive correlations with precipitation, while temperature variables exhibited weaker or slightly negative correlations.
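The Pearson correlation coefficient quoted for cloud cover and humidity can be computed as sketched below. This is a minimal standard-library illustration of the textbook formula, not the authors' analysis. The function name `pearson_r` and the sample series are assumptions, and the sample values are invented, so the resulting coefficient says nothing about the study's data.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance of x and y divided by the product of their spreads."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Invented daily values (hypothetical, NOT data from the study).
cloud_cover_pct = [10.0, 80.0, 95.0, 20.0, 60.0]
precip_m = [0.0, 0.002, 0.006, 0.0, 0.0]

r_cloud = pearson_r(cloud_cover_pct, precip_m)
print(f"r(cloud cover, precipitation) = {r_cloud:.2f}")
```

Pearson's r measures only linear association, so a moderate value such as the reported r ≈ 0.52 is compatible with a stronger non-linear dependence that tree-based models can still exploit.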
Contributions
- Introduced a novel taxonomy-driven benchmarking approach for machine learning models in precipitation forecasting, classifying them based on their parametric nature.
- Provided a systematic framework to assess how the structural assumptions of different model types influence predictive performance under heterogeneous climatic conditions.
- Focused on a specific regional context (Morocco, particularly semi-arid inland areas), addressing a gap in the literature that often concentrates on large-scale or global datasets.
- Demonstrated the clear advantages of non-parametric models, especially ensemble methods, for effectively handling highly intermittent, zero-inflated, and non-linear precipitation patterns in semi-arid environments.
Funding
The authors have no funding to declare.
Citation
@article{Fengour2026taxonomybased,
author = {EL Fengour, Abdelhak and El Motaki, Saloua},
title = {A taxonomy-based benchmark of parametric and non-parametric machine learning models for data-driven precipitation prediction in Morocco},
journal = {Theoretical and Applied Climatology},
year = {2026},
doi = {10.1007/s00704-026-06174-2},
url = {https://doi.org/10.1007/s00704-026-06174-2}
}
Original Source: https://doi.org/10.1007/s00704-026-06174-2