Data (Apr 2022)
Classification of Building Types in Germany: A Data-Driven Modeling Approach
Abstract
Building-level details play an essential part in many real-world application models. Energy systems, telecommunications, disaster management, the internet of things, health care, and marketing are a few of the many applications that require building information. The variables that most of these models require are building type, house type, area of living space, and number of residents. To provide some of this information, this paper introduces a methodology and generates the corresponding data. The study was conducted for specific applications in energy system modeling, but the data can also be used in other applications. Building locations and some building details are openly available as map data from OpenStreetMap (OSM). However, building types (e.g., residential, industrial, office, single-family house, multi-family house) are only partially available in the OSM dataset. Therefore, a machine learning classification algorithm was introduced to predict building types from the OSM building data. Although the OSM dataset is the fundamental and most crucial one used for modeling, the algorithm was trained on a dataset prepared by combining several features from three other datasets. The generated dataset consists of approximately 29 million buildings, of which about 19 million are residential, 72% of them single-family houses and the rest multi-family houses, which include two-family houses and apartment buildings. The results were validated against publicly available statistical data. The comparison with official statistics reveals a percentage error of 3.64% for residential buildings, 13.14% for single-family houses, and −15.38% for multi-family houses. Nevertheless, by incorporating building types, this dataset can complement existing building information in studies for which building type information is crucial.
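The validation figures above are signed percentage errors. A minimal sketch of how such a figure could be computed is given below. It assumes the common convention (predicted − reference) / reference × 100, which the abstract does not state explicitly. The function name and the building counts are hypothetical and purely illustrative; they are not values from the study.

```python
def percentage_error(predicted: float, reference: float) -> float:
    """Signed percentage error of a predicted count against a reference count.

    Assumed convention: positive means the model overestimates the reference,
    negative means it underestimates.
    """
    if reference == 0:
        raise ValueError("reference count must be non-zero")
    return (predicted - reference) / reference * 100.0


# Hypothetical counts for illustration only (not data from the paper):
# class name -> (predicted count, official-statistics count)
illustrative_counts = {
    "single_family": (1_100, 1_000),
    "multi_family": (450, 500),
}

for building_class, (predicted, reference) in illustrative_counts.items():
    error = percentage_error(predicted, reference)
    print(f"{building_class}: {error:+.2f}%")
```

Under this convention, an overestimate in one class and an underestimate in a complementary class can partly cancel when the classes are summed, which would be consistent with a smaller error for the aggregate class than for its subclasses.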
Keywords