Frontiers in Public Health (Dec 2016)

The Problem with Big Data: Operating on Smaller Datasets to Bridge the Implementation Gap

  • Richard Mann,
  • Faisal Mushtaq,
  • Alan White,
  • Gabriel Cervantes,
  • Tom Pike,
  • Dalton Coker,
  • Stuart Murdoch,
  • Tim Hiles,
  • Clare Smith,
  • David Berridge,
  • Geoff Hall,
  • Suzanne Hinchliffe,
  • Stephen Smye,
  • Richard McGilchrist Wilkie,
  • Peter Lodge,
  • Mark Mon-Williams

DOI
https://doi.org/10.3389/fpubh.2016.00248
Journal volume & issue
Vol. 4

Abstract

Big datasets have the potential to revolutionize public health. However, there is a mismatch between the political and scientific optimism surrounding big data and the public’s perception of its benefit. We suggest a systematic and concerted emphasis on developing models derived from smaller datasets to show the public how big data can produce tangible benefits in the long term. To highlight the immediate value of a small data approach, we produced a proof-of-concept model predicting hospital length of stay. The results demonstrate that existing small datasets can be used to create models that generate reasonable predictions, thereby facilitating healthcare delivery. We propose that greater attention (and funding) be directed toward the use of existing information resources, in parallel with current efforts to create and exploit ‘big data’.

Keywords