Journal of Clinical and Translational Science (Jan 2023)

Sociome Data Commons: A scalable and sustainable platform for investigating the full social context and determinants of health

  • Sandra Tilmon,
  • Sharmilee Nyenhuis,
  • Anthony Solomonides,
  • Bruno Barbarioli,
  • Ankur Bhargava,
  • Suzi Birz,
  • Kathryn Bouzein,
  • Celine Cardenas,
  • Bradley Carlson,
  • Ellen Cohen,
  • Emily Dillon,
  • Brian Furner,
  • Zhong Huang,
  • Julie Johnson,
  • Nivedha Krishnan,
  • Kevin Lazenby,
  • Kaitlyn Li,
  • Sonya Makhni,
  • Doriane Miller,
  • Jonathan Ozik,
  • Carlos Santos,
  • Marc Sleiman,
  • Julian Solway,
  • Sanjay Krishnan,
  • Samuel Volchenboum

DOI
https://doi.org/10.1017/cts.2023.670
Journal volume & issue
Vol. 7

Abstract

Background/Objective: Non-clinical aspects of life, such as social, environmental, behavioral, psychological, and economic factors, which we call the sociome, play significant roles in shaping patient health and health outcomes. This paper introduces the Sociome Data Commons (SDC), a new research platform that enables large-scale data analysis for investigating such factors.

Methods: This platform focuses on “hyper-local” data, i.e., at the neighborhood or point level, a geospatial scale of data not adequately considered in existing tools and projects. We enumerate key insights gained regarding data quality standards, data governance, and organizational structure for long-term project sustainability. A pilot use case investigating sociome factors associated with asthma exacerbations in children residing on the South Side of Chicago used machine learning and six SDC datasets.

Results: The pilot use case reveals one dominant spatial cluster for asthma exacerbations and important roles of housing conditions and cost, proximity to Superfund pollution sites, urban flooding, violent crime, lack of insurance, and a poverty index.

Conclusion: The SDC has been purposefully designed to support and encourage extension of the platform to new datasets as well as the continued development, refinement, and adoption of standards for dataset quality, dataset inclusion, metadata annotation, and data access/governance. The asthma pilot has served as the first driver use case and demonstrates promise for future investigation into the sociome and clinical outcomes. Additional projects will be selected in part for their ability to exercise and grow the capacity of the SDC to meet its ambitious goals.

Keywords