Humanities & Social Sciences Communications (Nov 2021)

Machine learning methods for “wicked” problems: exploring the complex drivers of modern slavery

  • Rosa Lavelle-Hill,
  • Gavin Smith,
  • Anjali Mazumder,
  • Todd Landman,
  • James Goulding

DOI
https://doi.org/10.1057/s41599-021-00938-z
Journal volume & issue
Vol. 8, no. 1
pp. 1 – 11

Abstract

Read online

Abstract Forty million people are estimated to be in some form of modern slavery across the globe. Understanding the factors that make any particular individual or geographical region vulnerable to such abuse is essential for the development of effective interventions and policy. Efforts to isolate and assess the importance of individual drivers statistically are impeded by two key challenges: data scarcity and high dimensionality, typical of many “wicked problems”. The hidden nature of modern slavery restricts available data points; and the large number of candidate variables that are potentially predictive of slavery inflate the feature space exponentially. The result is a “small n, large p” setting, where overfitting and significant inter-correlation of explanatory variables can render more traditional statistical approaches problematic. Recent advances in non-parametric computational methods, however, offer scope to overcome such challenges and better capture the complex nature of modern slavery. We present an approach that combines non-linear machine-learning models and strict cross-validation methods with novel variable importance techniques, emphasising the importance of stability of model explanations via a Rashomon-set analysis. This approach is used to model the prevalence of slavery in 48 countries, with results bringing to light the importance of new predictive factors—such as a country’s capacity to protect the physical security of women, which has been previously under-emphasised in quantitative models. Further analyses uncover that women are particularly vulnerable to exploitation in areas where there is poor access to resources. Our model was then leveraged to produce new out-of-sample estimates of slavery prevalence for countries where no survey data currently exists.