Charjan et al. (2026) A Novel Hybrid Approach To Drought Forecasting: Leveraging Feature Engineering And Ensemble Methods
Identification
- Journal: Scientific Reports
- Year: 2026
- Date: 2026-02-09
- Authors: Ojas Charjan, Krutik Gajbhiye, Janhavi Warhade, Snehlata Wankhade
- DOI: 10.1038/s41598-026-37206-6
Research Groups
- Department of Computer Science and Engineering, Symbiosis Institute of Technology, Nagpur Campus, Symbiosis International (Deemed University), Pune, Maharashtra, India.
Short Summary
This study proposes a hybrid drought forecasting model that combines custom feature engineering, based on mathematical equations, with an ensemble Random Forest Classifier. The model reaches a reported accuracy of 98.52%, which the authors attribute to physically grounded feature transformations.
Objective
- To develop a robust and data-driven hybrid model for accurate drought forecasting by integrating selective feature engineering with ensemble machine learning techniques.
Study Configuration
- Spatial Scale: Not explicitly defined for the model's application; the data come from the "US Drought Meteorological Data" dataset on Kaggle.
- Temporal Scale: Weekly drought prediction, using historical meteorological and environmental parameters.
Methodology and Data
- Models used:
  - Custom mathematical equations for feature engineering (Drought Index, Precipitation Impact, Temperature Effect, Soil Moisture Index, Evapotranspiration Ratio).
  - Random Forest Classifier (ensemble machine learning model).
  - Comparative models: K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Logistic Regression.
- Data sources:
  - Kaggle Datasets: "US Drought Meteorological Data" [45].
  - Contains 61 columns including precipitation, temperature (at 2 meters, dew point, wet bulb, max/min range), humidity, wind speed (at 10 m, 50 m, max/min), elevation, slope, vegetation index (score), latitude, longitude, and soil quality quadrants (SQ1-SQ7).
  - Target variable: Drought level (categorical: No Drought, D0-D4).
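This summary does not reproduce the paper's five equations, so the following is only a minimal sketch of the general pattern (derive extra features from raw meteorological columns, then fit a Random Forest), not the authors' implementation. The synthetic data, the column names, the helper `add_engineered_features`, and the `dryness` formula are all hypothetical placeholders. The 50-tree, depth-5 setting mirrors the configuration listed under Main Results, with "tree depth" read here as scikit-learn's `max_depth`.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for weekly meteorological inputs (hypothetical columns).
n_samples = 600
precip = rng.gamma(shape=2.0, scale=2.0, size=n_samples)   # precipitation
temp = rng.normal(loc=20.0, scale=8.0, size=n_samples)     # temperature at 2 m
humidity = rng.uniform(2.0, 18.0, size=n_samples)          # humidity


def add_engineered_features(precip, temp, humidity):
    """Stack the raw inputs with one derived column.

    The ratio below is an illustrative placeholder, NOT one of the
    paper's equations, which this summary does not give.
    """
    dryness = (np.abs(temp) + 1.0) / (precip + humidity + 1.0)
    return np.column_stack([precip, temp, humidity, dryness])


X = add_engineered_features(precip, temp, humidity)

# Synthetic labels 0..5 standing in for No Drought, D0..D4:
# six equal-sized quantile bins of the placeholder dryness column.
edges = np.quantile(X[:, 3], [1 / 6, 2 / 6, 3 / 6, 4 / 6, 5 / 6])
y = np.digitize(X[:, 3], edges)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# 50 estimators and depth 5, as in the configuration reported in the summary.
model = RandomForestClassifier(n_estimators=50, max_depth=5, random_state=0)
model.fit(X_train, y_train)

# Accuracy on synthetic data only; it says nothing about the paper's results.
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Because the synthetic labels are a deterministic function of the engineered column, the score here only confirms that the pipeline runs end to end.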
Main Results
- The proposed Hybrid Drought Forecasting Model outperformed the comparison models:
  - Accuracy: 98.52%
  - Precision: 98.58%
  - F1-Score: 98.25%
- Regression-style error metrics were also strong:
  - Mean Absolute Error (MAE): 0.0195
  - Coefficient of Determination (R²): 0.9958
  - Root Mean Squared Error (RMSE): 0.0215
- In comparison, Logistic Regression, the next best model, achieved 91.65% accuracy, 0.2572 MAE, 0.7807 R², and 1.1259 RMSE.
- The best Random Forest configuration used 50 estimators and a tree depth of 5, yielding an accuracy of 98%.
- The confusion matrix showed few misclassifications, which the authors interpret as evidence of robustness and generalization capability.
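For reference, the metric types listed above can be computed from true and predicted labels as in the minimal pure-Python sketch below. The toy label vectors are invented for illustration. Macro averaging for precision and F1 is an assumption, since this summary does not say which averaging the authors used. How the authors applied MAE, RMSE, and R² to a categorical target is also not stated here; encoding the classes as integers 0 to 5 is one possible reading, assumed for illustration only.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_precision_f1(y_true, y_pred):
    """Unweighted mean of per-class precision and per-class F1."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, f1_scores = [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        f1_scores.append(f1)
    return sum(precisions) / len(labels), sum(f1_scores) / len(labels)


def regression_metrics(y_true, y_pred):
    """MAE, RMSE, and R^2, treating class labels as numbers (an assumption)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Toy labels: 0 = No Drought, 1..5 = D0..D4 (hypothetical encoding).
y_true = [0, 0, 1, 1, 2, 2, 3, 4, 5, 5]
y_pred = [0, 0, 1, 2, 2, 2, 3, 4, 5, 5]  # one D0 sample predicted as D1

acc = accuracy(y_true, y_pred)
prec, f1 = macro_precision_f1(y_true, y_pred)
mae, rmse, r2 = regression_metrics(y_true, y_pred)
print(acc, prec, f1, mae, rmse, r2)
```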
Contributions
- Introduction of a novel hybrid drought forecasting approach that combines deterministic mathematical models (physically grounded feature engineering) with the probabilistic learning capabilities of a Random Forest Classifier.
- Development of five new, interpretable features (Drought Index, Precipitation Impact, Temperature Effect, Soil Moisture Index, Evapotranspiration Ratio) derived from existing meteorological data using mathematical equations.
- Demonstration of higher prediction accuracy (98.52%) for drought levels than the comparison machine learning models (KNN, SVM, Logistic Regression), while maintaining interpretability and computational efficiency.
- Emphasis on the value of balancing accuracy, interpretability, and lightweight computation in drought prediction.
Funding
- The authors declare that no funding was received from any organization or agency in support of this research.
Citation
@article{Charjan2026Novel,
author = {Charjan, Ojas and Gajbhiye, Krutik and Warhade, Janhavi and Wankhade, Snehlata},
title = {A Novel Hybrid Approach To Drought Forecasting: Leveraging Feature Engineering And Ensemble Methods},
journal = {Scientific Reports},
year = {2026},
doi = {10.1038/s41598-026-37206-6},
url = {https://doi.org/10.1038/s41598-026-37206-6}
}
Original Source: https://doi.org/10.1038/s41598-026-37206-6