Energies (Jul 2024)

Addressing Data Scarcity in Solar Energy Prediction with Machine Learning and Augmentation Techniques

  • Aleksandr Gevorgian,
  • Giovanni Pernigotto,
  • Andrea Gasparella

DOI
https://doi.org/10.3390/en17143365
Journal volume & issue
Vol. 17, no. 14
p. 3365

Abstract

The accurate prediction of global horizontal irradiance (GHI) is crucial for optimizing solar power generation systems, particularly in mountainous areas with complex topography and unique microclimates. These regions face significant challenges due to limited reliable data and the dynamic nature of local weather conditions, which complicate accurate GHI measurement. The scarcity of precise data impedes the development of reliable solar energy prediction models, impacting both economic and environmental outcomes. To address these data scarcity challenges, this paper focuses on various locations in Europe and Asia Minor, predominantly in mountainous regions. Machine learning techniques, including random forest (RF) and extreme gradient boosting (XGBoost) regressors, are employed to predict GHI. Optimizing the training data distribution based on cloud opacity values and integrating synthetic data significantly enhance predictive accuracy, with R² scores ranging from 0.91 to 0.97 across multiple locations. Substantial reductions in root mean square error (RMSE), mean absolute error (MAE), and mean bias error (MBE) further underscore the improved reliability of the predictions. Future research should refine synthetic data generation, optimize the integration of additional meteorological and environmental parameters, extend the methodology to new regions, and test it for predicting global tilted irradiance (GTI). Future studies should also expand training data considerations beyond cloud opacity, incorporating sky cover and sunshine duration to enhance prediction accuracy and reliability.
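
For readers unfamiliar with the four scores named in the abstract, the following minimal Python sketch shows how RMSE, MAE, MBE, and R² are conventionally computed from paired observed and predicted GHI values. It is an illustration only, not the authors' code: the function name `error_metrics`, the sample values, and the sign convention for MBE (predicted minus observed, so a positive value means over-prediction) are assumptions made here.

```python
import math


def error_metrics(observed, predicted):
    """Return RMSE, MAE, MBE and R^2 for paired observed/predicted values.

    Assumption: residuals are taken as predicted minus observed, so a
    positive MBE indicates over-prediction on average.
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    residuals = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)  # residual sum of squares

    mbe = sum(residuals) / n                  # mean bias error
    mae = sum(abs(r) for r in residuals) / n  # mean absolute error
    rmse = math.sqrt(ss_res / n)              # root mean square error

    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    # R^2 is undefined when the observations have zero variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"rmse": rmse, "mae": mae, "mbe": mbe, "r2": r2}


# Hypothetical GHI values in W/m^2, invented for illustration;
# they are not taken from the paper.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 420.0]

metrics = error_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

RMSE penalizes large errors more heavily than MAE, while MBE shows whether a model systematically over- or under-predicts, which is why the three are usually reported together alongside R².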

Keywords