Active learning concerning sampling cost for enhancing AI-enabled building energy system modeling

Ao Li; Fu Xiao; Ziwei Xiao; Rui Yan; Anbang Li; Yan Lv; Bing Su

Advances in Applied Energy (Dec 2024)

Affiliations

Ao Li: Department of Building Environment and Energy Engineering, The Hong Kong Polytechnic University, Hong Kong, China
Fu Xiao: Department of Building Environment and Energy Engineering, The Hong Kong Polytechnic University, Hong Kong, China; Research Institute for Smart Energy, The Hong Kong Polytechnic University, Hong Kong, China; Corresponding author.
Ziwei Xiao: Department of Building Environment and Energy Engineering, The Hong Kong Polytechnic University, Hong Kong, China
Rui Yan: Midea Building Technologies Division, Midea Group, Guangdong 528311, China
Anbang Li: Midea Building Technologies Division, Midea Group, Guangdong 528311, China
Yan Lv: Midea Building Technologies Division, Midea Group, Guangdong 528311, China
Bing Su: Midea Building Technologies Division, Midea Group, Guangdong 528311, China

Journal volume & issue: Vol. 16, p. 100189

Abstract

Machine learning is widely recognized as a promising data-driven modeling technique for the model-based control and optimization of building energy systems. However, data-driven models often generalize poorly because the training data available from building operations usually cover only a limited range of working conditions. Active learning can proactively test unseen, informative working conditions and add the resulting samples to the training set, improving the generalization performance of data-driven models. This paper develops a novel distance and information density-based sampling strategy that accounts for the real-time status of building operation and the outdoor environment. Based on the Mahalanobis distance, the strategy determines the sampling value of an unlabeled sample (an unseen working condition) by assessing its similarity to both the training samples and the other unlabeled samples. Because collecting sufficiently representative samples can be difficult, costly, and time-consuming, a distance-based sampling cost metric is proposed to compare the efficiency of different sampling methods, considering the detrimental effects of the active sampling process on the normal operation of building energy systems. The paper presents a comprehensive, in-depth comparison of five active learning methods, including one incorporating the distance-based sampling strategy, through experiments on data collected from the cooling towers of a real high-rise building. The results show that active learning can effectively identify informative data samples and improve the generalization performance of data-driven models. The research outcomes are valuable for enhancing AI-enabled data-driven modeling of building energy systems while substantially reducing data sampling costs.
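The abstract does not give the paper's formulas, so the following is only a minimal sketch of how a Mahalanobis-distance and information-density score could be computed. The function names (`inverse_covariance`, `sampling_scores`, `sampling_cost`), the weighting exponent `beta`, the `1 / (1 + d)` similarity, and the choice to measure cost as the distance from the current operating condition are all assumptions made for illustration, not the authors' definitions.

```python
import numpy as np

def inverse_covariance(X):
    """Pseudo-inverse of the feature covariance (rows = samples)."""
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    return np.linalg.pinv(cov)

def mahalanobis_matrix(A, B, cov_inv):
    """Pairwise Mahalanobis distances between rows of A and rows of B."""
    diff = A[:, None, :] - B[None, :, :]              # shape (n_a, n_b, d)
    sq = np.einsum("ijk,kl,ijl->ij", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))               # guard tiny negatives

def sampling_scores(X_train, X_pool, beta=1.0):
    """Hypothetical sampling value for each unlabeled working condition.

    novelty: distance to the nearest training sample (far = unseen).
    density: mean similarity to the other unlabeled samples
             (high = representative of many candidate conditions).
    """
    cov_inv = inverse_covariance(np.vstack([X_train, X_pool]))
    novelty = mahalanobis_matrix(X_pool, X_train, cov_inv).min(axis=1)
    d_pool = mahalanobis_matrix(X_pool, X_pool, cov_inv)
    density = (1.0 / (1.0 + d_pool)).mean(axis=1)
    return novelty * density ** beta

def sampling_cost(x_current, X_pool, cov_inv):
    """Hypothetical cost: how far each candidate is from the current
    operating condition, as a proxy for disturbance to normal operation."""
    return mahalanobis_matrix(X_pool, x_current[None, :], cov_inv)[:, 0]

# Toy data: two illustrative features per working condition.
X_train = np.array([[0.0, 0.0], [1.0, 0.5], [0.5, 1.0]])
X_pool = np.array([[0.0, 0.0], [3.0, 3.0], [3.2, 2.9]])
scores = sampling_scores(X_train, X_pool)
next_index = int(np.argmax(scores))   # candidate to test next
```

In this sketch a candidate that duplicates an existing training sample scores zero, while candidates that are both far from the training set and close to many other unlabeled conditions score highest. Dividing the score by the cost would be one way to trade informativeness against disruption, though the paper may combine them differently.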

Published in Advances in Applied Energy

ISSN: 2666-7924 (Online)
Publisher: Elsevier
Country of publisher: United Kingdom
LCC subjects: Social Sciences: Industries. Land use. Labor: Special industries and trades: Energy industries. Energy policy. Fuel trade
Website: https://www.journals.elsevier.com/advances-in-applied-energy
