Journal of Universal Computer Science (Apr 2024)

Price Prediction and Determination of the Affecting Variables of the Real Estate by Using X-Means Clustering and CART Decision Trees

  • Sait Can Yucebas,
  • Sukran Yalpir,
  • Levent Genc,
  • Melike Dogan

DOI
https://doi.org/10.3897/jucs.98733
Journal volume & issue
Vol. 30, no. 4
pp. 531 – 560

Abstract

The use of machine learning in real estate valuation is relatively new. When the study area is large, the factors affecting price may vary with geographical region and socioeconomic conditions. A model that reflects these differences is expected to predict prices more accurately than a single general model. Unsupervised learning methods can be used both to improve prediction performance and to show how the factors affecting price vary between regions. To this end, this study established a hybrid model that combines X-Means clustering with CART decision trees. The model successfully learned the geographical and physical variables that affect price. Its prediction performance was compared with that of the direct capitalization method, which is the gold standard in the domain. The hybrid model outperformed direct capitalization in terms of mean square error, root mean square error and adjusted R-squared, with scores of 72.86, 0.0057 and 0.978, respectively. The effect of clustering was also examined: clustering increased prediction performance by 36%.
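For readers unfamiliar with the three evaluation metrics named in the abstract, the following is a minimal sketch of how they are conventionally computed. It is not the authors' evaluation code. The function names, the sample values and the `n_features` count are all assumptions made for illustration, and the numbers bear no relation to the study's data or results.

```python
import math


def mse(y_true, y_pred):
    """Mean square error: average of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def adjusted_r2(y_true, y_pred, n_features):
    """Adjusted R-squared: R-squared penalised for the number of predictors."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - n_features - 1)


# Hypothetical prices and predictions, invented purely for illustration.
y_true = [100.0, 150.0, 200.0, 250.0, 300.0]
y_pred = [110.0, 140.0, 210.0, 240.0, 310.0]

print(mse(y_true, y_pred))
print(rmse(y_true, y_pred))
print(adjusted_r2(y_true, y_pred, n_features=2))  # 2 predictors is an assumed value
```

Lower MSE and RMSE indicate smaller errors, while an adjusted R-squared closer to 1 indicates that the model explains more of the variance in price after accounting for the number of predictors.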

Keywords