IEEE Access (Jan 2024)

Forecasting and Performance Analysis of Energy Production in Solar Power Plants Using Long Short-Term Memory (LSTM) and Random Forest Models

  • Kadir Olcay,
  • Samet Giray Tunca,
  • Mustafa Arif Ozgur

DOI
https://doi.org/10.1109/ACCESS.2024.3432574
Journal volume & issue
Vol. 12
pp. 103299 – 103312

Abstract

The rapid increase in energy demand and the drawbacks of fossil fuels in electricity production have placed greater emphasis on renewable energy sources, and research on their use has gained importance accordingly. Numerous factors influence the energy production of power plants that generate electricity from these sources. Solar power plants (SPPs) in particular are strongly affected by environmental factors and meteorological variables, which impact the continuity of their electrical energy production. This study therefore developed prediction models using two different machine learning methods to analyze and predict changes in the electrical energy production of SPPs caused by environmental factors and meteorological changes. The data are real measurements collected from a 180 kWe solar power plant currently in operation, with collection starting on the day the plant was commissioned. Using these data, the effects of pollution and environmental impacts on the energy production of the PV panels are demonstrated. To mitigate these effects and examine the impact of adverse conditions on production efficiency, two analysis methods were used and their results compared: a Random Forest Regression (RFR) model and Long Short-Term Memory (LSTM) networks, a type of artificial neural network (ANN). Random Forest (RF) regression, which models energy production as a function of non-linear independent input variables, was used to build a prediction model for the decrease in the plant's energy production due to pollution and environmental impacts. The model estimated the SPP's electrical energy production from pollution and environmental impact measurements.
A graph comparing the estimated energy production with the actual values is shown. In a second analysis phase, LSTM networks were trained with data from the SPP and the measurement station. The energy production of the power plant was then estimated with the trained LSTM networks, and the results are shown graphically. A very large data set was used in both prediction models. The training data set consists of daily and hourly data recorded since the power plant began operation: hourly sunshine duration, accumulated irradiation (Wh/m²), hourly maximum temperature, hourly minimum temperature, humidity (%), hourly temperature, hourly total precipitation (kg/m²), wind speed (m/s), pollution, and the energy production values of the power plant. A total of 119,808 data points were processed in the prediction model. The results were evaluated using four performance measures: correlation coefficient (R), fractional gross error (FGE), mean bias error (MBE), and root mean square error (RMSE). The RF predictions had a correlation coefficient of 0.8111, whereas the LSTM predictions had an R value of 0.9759. Comparing RFR and LSTM, LSTM provides much better results for models built on the entire data set.
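
The abstract names four performance measures but does not give their formulas. As a minimal sketch, the Python below computes them from paired observed and predicted values, assuming commonly used definitions: Pearson correlation for R, FGE as the mean of 2|P - O| / |P + O|, MBE taken here as the mean bias error (mean of P - O), and the usual RMSE. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def fractional_gross_error(observed, predicted):
    """FGE under an assumed common definition: mean of 2|P - O| / |P + O|.

    Pairs whose sum is zero are skipped to avoid division by zero.
    """
    terms = [
        2.0 * abs(p - o) / abs(p + o)
        for o, p in zip(observed, predicted)
        if p + o != 0
    ]
    return sum(terms) / len(terms)


def mean_bias_error(observed, predicted):
    """Mean of (predicted - observed); positive means over-prediction."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical hourly energy values (kWh), for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

for name, metric in [
    ("R", pearson_r),
    ("FGE", fractional_gross_error),
    ("MBE", mean_bias_error),
    ("RMSE", rmse),
]:
    print(f"{name}: {metric(observed, predicted):.4f}")
```

Under these definitions, R close to 1 with FGE, MBE, and RMSE close to 0 indicates good agreement, which is the sense in which the reported R of 0.9759 for LSTM compares favorably with 0.8111 for RF.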

Keywords