Data (Nov 2022)
Identifying and Classifying Urban Data Sources for Machine Learning-Based Sustainable Urban Planning and Decision Support Systems Development
Abstract
As the amount and variety of data constantly produced, collected, and exchanged between systems increase, the efficiency and accuracy of solutions and services that take data as input may suffer if an inappropriate or inaccurate technique, method, or tool is chosen to handle them. This paper presents a global overview of the urban data sources and structures used to train machine learning (ML) algorithms integrated into urban planning decision support systems (DSS). It contributes to a common understanding of how to choose the right urban data for a given urban planning issue, i.e., their type, source, and structure, for more efficient use in training ML models. For this purpose, we conduct a systematic literature review (SLR) of all relevant peer-reviewed studies available in the Scopus database. More precisely, 248 papers were found to be relevant and were further analyzed using a text-mining approach to determine (a) the main urban data sources used for ML modeling, (b) the most popular approaches used in relevant urban planning and urban problem-solving studies and their relationship to the type of data source used, and (c) the problems commonly encountered in their use. After classification, we identified the strengths and weaknesses of the data sources with respect to several predefined factors. We found that the data come mainly from two categories of sources, namely (1) sensors and (2) statistical surveys, including social network data. They can be classified as (a) opportunistic or (b) non-opportunistic depending on the process of data acquisition, collection, and storage. Data sources are closely correlated with their structure and with the potential urban planning issues to be addressed.
Almost all urban data have an indexed structure, in particular either attribute tables, for statistical survey data and data from simple sensors (e.g., climate and pollution sensors), or vectors, mostly obtained from satellite images after large-scale spatio-temporal analysis. The paper also discusses the potential opportunities, emerging issues, and challenges that urban data sources face and should overcome to better catalyze intelligent/smart planning. This should contribute to a general understanding of the data, their sources, and the challenges to be faced and overcome by those seeking data and integrating them into smart applications and urban-planning processes.
Keywords