Humanities & Social Sciences Communications (Jun 2024)

Advanced modeling of housing locations in the city of Tehran using machine learning and data mining techniques

  • Ali Asghar Pilehvar,
  • Arian Ghasemi

DOI
https://doi.org/10.1057/s41599-024-03244-6
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 13

Abstract

This research examines the dynamics of housing location in the metropolis of Tehran, with the aim of better understanding the factors that influence housing prices across the city. Using a descriptive-analytical method, the study applies the Python programming language and its libraries, along with various regression models, to a dataset of 8000 villas and apartments spread across 22 districts and 317 areas. Data obtained from official sources are used to examine the correlation between housing prices and nine key determinants. The findings reveal positive correlations between the total value of a house and several factors: surface area (80%), neighborhood location (75%), presence of an elevator (44%), presence of a parking lot (43%), and year of construction (26%). Surface area and neighborhood are therefore the most important of these determinants. Conversely, the district number shows an inverse correlation (−41%): the higher the district number, the lower the total value. In its final stage, the study uses cross-validation to evaluate the performance of various learning models, achieving a maximum accuracy of 85%. The research concludes by presenting a new formulation and modeling approach for determining the total value of housing.
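The workflow described in the abstract (correlating each determinant with total value, then scoring a regression model by cross-validation) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the data are synthetic, the feature names (`area`, `district`, `has_elevator`) and the coefficients used to generate `price` are hypothetical, an ordinary least-squares model stands in for the unspecified regression models, and the R² score is assumed as the accuracy measure because the abstract does not name its metric.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in data (NOT the study's dataset); all values are made up.
area = rng.uniform(40, 300, n)                       # surface area, m^2
district = rng.integers(1, 23, n).astype(float)      # district number, 1..22
has_elevator = rng.integers(0, 2, n).astype(float)   # 0 = no, 1 = yes
price = (50.0 * area - 300.0 * district + 1000.0 * has_elevator
         + rng.normal(0, 1500, n))                   # hypothetical total value

features = {"area": area, "district": district, "has_elevator": has_elevator}

# Pearson correlation of each determinant with total value.
correlations = {name: float(np.corrcoef(col, price)[0, 1])
                for name, col in features.items()}


def kfold_r2(X, y, k=5, seed=0):
    """Return the held-out R^2 of a least-squares fit for each of k folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Add an intercept column, then fit on the training folds only.
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        pred = A_test @ coef
        ss_res = np.sum((y[test] - pred) ** 2)
        ss_tot = np.sum((y[test] - y[test].mean()) ** 2)
        scores.append(float(1.0 - ss_res / ss_tot))
    return scores


X = np.column_stack(list(features.values()))
scores = kfold_r2(X, price)

for name, r in correlations.items():
    print(f"{name}: r = {r:+.2f}")
print(f"mean cross-validated R^2: {np.mean(scores):.2f}")
```

Scoring each fold on data that was held out of the fit is what makes the reported figure an estimate of out-of-sample performance rather than of goodness of fit. The numbers this sketch prints reflect only the made-up generating coefficients and should not be compared with the study's results.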