Machine Learning with Applications (Sep 2022)

Machine-learning models for spatially-explicit forecasting of future racial segregation in US cities

  • Tomasz F. Stepinski,
  • Anna Dmowska

Journal volume & issue
Vol. 9
p. 100359

Abstract

Residential racial segregation in large US cities is a complex phenomenon with important social, political, and economic ramifications. In this paper, we demonstrate that future segregation can be predicted using an empirical model generated by a machine learning (ML) algorithm. Specifically, we predict a future map of neighborhood types, that is, racial compositions quantized to several archetypes. Within such a framework, predicting segregation is tantamount to predicting a thematic map of future neighborhood types. An ML model of change is trained on historical changes and then used to make predictions. A key design decision in an ML model is the choice of attributes, the variables that drive the change. We hypothesize that the change in a spatial unit's neighborhood type depends only on the unit's present type and on statistics of the types in surrounding units. The paper asks three questions and answers each in the affirmative. Is our hypothesis validated by the results? Does the proposed methodology yield useful predictions? Do our results agree with competing predictions? To answer these questions, we train and validate a number of change models using, as a case study, 1990, 2000, 2010, and 2020 US Census Bureau block-level data for Cook County, IL (Chicago). We investigate four algorithms (Random Forest, Gradient Boosted Trees, Neural Network, and Self-Normalizing Net) and find that Gradient Boosted Trees (GBT) yields the best predictions. Using the GBT-generated model, we predict residential segregation in Cook County for the year 2030.
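
The hypothesis stated in the abstract, that a unit's future type depends only on its present type and on statistics of the types around it, can be illustrated with a minimal sketch of how such attributes might be assembled. This is not the authors' implementation: the regular grid (the paper works with census blocks), the square window, the choice of type fractions as the neighborhood statistic, the toy maps, and every function and variable name below are assumptions made for illustration only.

```python
from collections import Counter


def neighborhood_features(type_map, row, col, n_types, radius=1):
    """Return [present type, fraction of each type among surrounding cells].

    Assumes type_map is a rectangular grid (list of lists) of integer
    type labels in range(n_types); the grid layout is an illustrative
    stand-in for census blocks.
    """
    n_rows, n_cols = len(type_map), len(type_map[0])
    counts = Counter()
    # Visit every cell in the square window, clipped at the map edges.
    for r in range(max(0, row - radius), min(n_rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(n_cols, col + radius + 1)):
            if (r, c) != (row, col):  # skip the unit itself
                counts[type_map[r][c]] += 1
    total = sum(counts.values())
    fractions = [counts[t] / total if total else 0.0 for t in range(n_types)]
    return [type_map[row][col]] + fractions


def build_training_set(map_before, map_after, n_types, radius=1):
    """Pair the attributes at the earlier date with the type at the later date."""
    X, y = [], []
    for row in range(len(map_before)):
        for col in range(len(map_before[0])):
            X.append(neighborhood_features(map_before, row, col, n_types, radius))
            y.append(map_after[row][col])
    return X, y


# Toy 3x3 maps with three hypothetical neighborhood types (0, 1, 2).
map_2000 = [[0, 0, 1],
            [0, 1, 1],
            [2, 2, 1]]
map_2010 = [[0, 1, 1],
            [0, 1, 1],
            [2, 1, 1]]

X, y = build_training_set(map_2000, map_2010, n_types=3)
```

Under these assumptions, `X` and `y` form one training table of historical change; a classifier such as scikit-learn's `GradientBoostingClassifier` could then be fitted on it and applied to attributes computed from the most recent map to produce a predicted map for the next date.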

Keywords