Energies (Sep 2021)

Machine Learning and Data Segmentation for Building Energy Use Prediction—A Comparative Study

  • William Mounter,
  • Chris Ogwumike,
  • Huda Dawood,
  • Nashwan Dawood

DOI
https://doi.org/10.3390/en14185947
Journal volume & issue
Vol. 14, no. 18
p. 5947

Abstract

Advances in metering technologies and emerging energy forecast strategies provide both opportunities and challenges for predicting short- and long-term building energy usage. Machine learning is an important energy prediction technique and is gaining significant research attention. Applying different machine learning techniques within a rolling-horizon framework can help to reduce the prediction error over time. Because error increases significantly beyond short-term horizons, most reported energy forecasts based on statistical and machine learning techniques cover a range of no more than one week. The aim of this study was to investigate how facility managers can improve the accuracy of their building's long-term energy forecasts. This paper presents an extensive study of machine learning and data processing techniques and of how accurately they predict energy usage over different forecast ranges. The Clarendon building of Teesside University was selected as a case study to demonstrate the prediction of overall energy usage with three machine learning techniques: polynomial regression (PR), support vector regression (SVR) and artificial neural networks (ANNs). The study further examined how preprocessing the training data affects the overall accuracy of the prediction models, for example by segmenting the training data by building mode (active and dormant) or by day of the week (weekdays and weekends). The results illustrate a significant reduction in the mean absolute percentage error (MAPE) for building energy usage predictions segmented by weekday and weekend when compared to unsegmented monthly predictions. Reductions in MAPE of 5.27%, 11.45% and 12.03% were achieved with PR, SVR and ANN, respectively.
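
The weekday/weekend segmentation and the MAPE metric mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the readings are synthetic, the function names `mape` and `segment_by_day_type` are hypothetical, and a simple per-segment mean stands in for the PR, SVR and ANN models, purely to show why fitting each day type separately can lower the error.

```python
from datetime import date, timedelta
from statistics import mean


def mape(actual, predicted):
    """Mean absolute percentage error in percent (assumes no actual value is zero)."""
    return 100.0 * mean(abs((a - p) / a) for a, p in zip(actual, predicted))


def segment_by_day_type(readings):
    """Split (date, kWh) pairs into weekday and weekend groups."""
    segments = {"weekday": [], "weekend": []}
    for day, kwh in readings:
        key = "weekend" if day.weekday() >= 5 else "weekday"
        segments[key].append((day, kwh))
    return segments


# Synthetic two-week series (assumption): high usage on weekdays, low on weekends.
start = date(2021, 3, 1)  # a Monday
days = [start + timedelta(days=i) for i in range(14)]
readings = [(d, 400.0 if d.weekday() >= 5 else 1000.0) for d in days]
actual = [kwh for _, kwh in readings]

# Unsegmented stand-in model: one mean for every day.
overall_mean = mean(actual)
unsegmented_mape = mape(actual, [overall_mean] * len(actual))

# Segmented stand-in model: one mean per day type, looked up by day.
segments = segment_by_day_type(readings)
segment_means = {k: mean(kwh for _, kwh in v) for k, v in segments.items()}
segmented_pred = [
    segment_means["weekend" if d.weekday() >= 5 else "weekday"] for d, _ in readings
]
segmented_mape = mape(actual, segmented_pred)

print(f"unsegmented MAPE: {unsegmented_mape:.2f}%")
print(f"segmented MAPE:   {segmented_mape:.2f}%")
```

Because the synthetic data are perfectly regular, the segmented stand-in fits them exactly. Real metered data would leave residual error in both cases, and any comparison should be made on held-out data rather than on the training series as done here.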

Keywords