Trials (Feb 2020)

Online randomized controlled experiments at scale: lessons and extensions to medicine

  • Ron Kohavi
  • Diane Tang
  • Ya Xu
  • Lars G. Hemkens
  • John P. A. Ioannidis

DOI
https://doi.org/10.1186/s13063-020-4084-y
Journal volume & issue
Vol. 21, no. 1
pp. 1 – 9

Abstract

Background: Many technology companies, including Airbnb, Amazon, Booking.com, eBay, Facebook, Google, LinkedIn, Lyft, Microsoft, Netflix, Twitter, Uber, and Yahoo!/Oath, run online randomized controlled experiments at scale, namely hundreds of concurrent controlled experiments on millions of users each, commonly referred to as A/B tests. Originally derived from the same statistical roots, randomized controlled trials (RCTs) in medicine are now criticized for being expensive and difficult, while in technology, the marginal cost of such experiments is approaching zero and the value for data-driven decision-making is broadly recognized.

Methods and results: This is an overview of key scaling lessons learned in the technology field. They include (1) a focus on metrics, an overall evaluation criterion and thousands of metrics for insights and debugging, automatically computed for every experiment; (2) quick release cycles with automated ramp-up and shut-down that afford agile and safe experimentation, leading to consistent incremental progress over time; and (3) a culture of ‘test everything’ because most ideas fail and tiny changes sometimes show surprising outcomes worth millions of dollars annually. Technological advances, online interactions, and the availability of large-scale data allowed technology companies to take the science of RCTs and use them as online randomized controlled experiments at large scale with hundreds of such concurrent experiments running on any given day on a wide range of software products, be they web sites, mobile applications, or desktop applications. Rather than hindering innovation, these experiments enabled accelerated innovation with clear improvements to key metrics, including user experience and revenue. As healthcare increases interactions with patients utilizing these modern channels of web sites and digital health applications, many of the lessons apply. The most innovative technological field has recognized that systematic series of randomized trials with numerous failures of the most promising ideas leads to sustainable improvement.

Conclusion: While there are many differences between technology and medicine, it is worth considering whether and how similar designs can be applied via simple RCTs that focus on healthcare decision-making or service delivery. Changes – small and large – should undergo continuous and repeated evaluations in randomized trials and learning from their results will enable accelerated healthcare improvements.

Keywords