Employing machine learning for advanced gap imputation in solar power generation databases

Tatiane Costa; Bruno Falcão; Mohamed A. Mohamed; Andres Annuk; Manoel Marinho

doi:10.1038/s41598-024-74342-3

Scientific Reports (Oct 2024)

Affiliations

Tatiane Costa: Polytechnic School of Engineering (POLI-UPE), Postgraduate Program in Systems Engineering, University of Pernambuco (UPE)
Bruno Falcão: Polytechnic School of Engineering (POLI-UPE), Postgraduate Program in Systems Engineering, University of Pernambuco (UPE)
Mohamed A. Mohamed: Department of Electrical Engineering, Faculty of Engineering, Minia University
Andres Annuk: Institute of Forestry and Engineering, Estonian University of Life Sciences
Manoel Marinho: Polytechnic School of Engineering (POLI-UPE), Postgraduate Program in Systems Engineering, University of Pernambuco (UPE)

Journal volume & issue: Vol. 14, no. 1, pp. 1–17

Abstract

This research evaluates the application of machine learning algorithms, specifically Random Forest and Gradient Boosting, to the imputation of missing data in solar energy generation databases, and the effect of that imputation on the sizing of green hydrogen production systems. The Random Forest model performs best at using solar data to optimize hydrogen production, achieving a mean absolute error (MAE) of 0.0364, a mean squared error (MSE) of 0.0097, a root mean squared error (RMSE) of 0.0985, and a coefficient of determination (R²) of 0.9779. These metrics surpass those of the baseline models, which include linear regression and recurrent neural networks, and they highlight the potential of accurate imputation to improve the efficiency and output of renewable energy systems. The findings support integrating robust data imputation methods into the design and operation of photovoltaic systems, which contributes to the reliability and sustainability of energy resource management. The study also compares the performance of traditional machine learning models in handling data gaps and emphasizes the practical implications of data imputation for optimizing hydrogen production systems. The detailed analysis and validation of the imputation models offer insights for future work in renewable energy technology.
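The four error metrics quoted in the abstract are standard regression measures, and a short sketch may help show how they relate to one another. The code below is illustrative only: the function name `regression_metrics` and the sample values are assumptions made for this example, not the authors' code or data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # root of the MSE

    # R^2 compares the residual sum of squares with the total variance
    # of the reference values around their mean.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical normalized PV output: held-out measurements vs. imputed values.
measured = [0.0, 0.2, 0.5, 0.8, 1.0]
imputed = [0.0, 0.3, 0.5, 0.7, 1.0]
metrics = regression_metrics(measured, imputed)
print(metrics)
```

Because RMSE is the square root of MSE, the reported values can be cross-checked: the square root of 0.0097 is about 0.0985, which matches the RMSE given in the abstract.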

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
