Journal of Transport and Land Use (May 2024)
Sydney’s residential relocation landscape: Machine learning and feature selection methods unpack the whys and whens
Abstract
This study investigates the timing of household residential relocation, an aspect vital to transport and urban planning. Using a high-dimensional dataset of 1,024 relocations in Sydney, Australia, the research compares ten machine learning survival techniques with three classical survival models. Results indicate that classical models, when paired with tree-based automated feature selectors, closely match the outcomes of the machine learning models. Notably, GBM, XGBoost, and Random Forest emerge as the standout performers. The study provides a comprehensive comparison between automatic and manual feature selection, shedding light on the variables that influence households’ duration of stay. Stacked ensemble modeling, which combines predictions from several models, is used to enhance accuracy, but the improvements are marginal. This points to inherent modeling challenges, particularly the recurring misclassification of specific pairs of households in the concordance index measure. A thorough feature analysis identifies homeownership as the foremost predictor and also highlights the importance of recent life events and accessibility features in relocation decisions. The research emphasizes the need to consider the accessibility of both current and future homes in relocation models, as these features account for 20% of feature significance in model outcomes. Building on these foundational insights, the study paves the way for a deeper understanding of individual decision-making processes in sustainable urban planning.
Keywords