PLOS Digital Health (Jan 2022)

Best practices in the real-world data life cycle

  • Joe Zhang,
  • Joshua Symons,
  • Paul Agapow,
  • James T. Teo,
  • Claire A. Paxton,
  • Jordan Abdi,
  • Heather Mattie,
  • Charlie Davie,
  • Aracelis Z. Torres,
  • Amos Folarin,
  • Harpreet Sood,
  • Leo A. Celi,
  • John Halamka,
  • Sara Eapen,
  • Sanjay Budhdeo

Journal volume & issue
Vol. 1, no. 1

Abstract

With increasing digitization of healthcare, real-world data (RWD) are available in greater quantity and scope than ever before. Since the 2016 United States 21st Century Cures Act, innovations in the RWD life cycle have taken tremendous strides forward, largely driven by demand for regulatory-grade real-world evidence from the biopharmaceutical sector. However, use cases for RWD continue to grow in number, moving beyond drug development, to population health and direct clinical applications pertinent to payors, providers, and health systems. Effective RWD utilization requires disparate data sources to be turned into high-quality datasets. To harness the potential of RWD for emerging use cases, providers and organizations must accelerate life cycle improvements that support this process. We build on examples obtained from the academic literature and author experience of data curation practices across a diverse range of sectors to describe a standardized RWD life cycle containing key steps in production of useful data for analysis and insights. We delineate best practices that will add value to current data pipelines. Seven themes are highlighted that ensure sustainability and scalability for RWD life cycles: data standards adherence, tailored quality assurance, data entry incentivization, deploying natural language processing, data platform solutions, RWD governance, and ensuring equity and representation in data.