Journal of Big Data (May 2018)

Forecasting AIDS prevalence in the United States using online search traffic data

  • Amaryllis Mavragani
  • Gabriela Ochoa

DOI
https://doi.org/10.1186/s40537-018-0126-7
Journal volume & issue
Vol. 5, no. 1
pp. 1 – 21

Abstract

Over the past decade, with the increasing use of the Internet, the assessment of health issues using online search traffic data has become an integral part of Health Informatics. Internet data in general, and Google Trends data in particular, have been shown to be valid and valuable for prediction, forecasting, and nowcasting, and for detecting, tracking, and monitoring disease outbreaks and epidemics. Empirical relationships have been shown to exist between Google Trends data and official data on several health topics, with the science of infodemiology using the vast amount of information available online to assess public health and policy matters. The aim of this study is to provide a method of forecasting AIDS prevalence in the US using online search traffic data from Google Trends on AIDS-related terms. The results first show that significant correlations between Google Trends data and official health data on AIDS prevalence (2004–2015) exist in several States, while the estimated forecasting models for AIDS prevalence show that official health data and Google Trends data on AIDS follow a logarithmic relationship. Overall, the results of this study support previous work on the subject, suggesting that Google data are valid and valuable for the analysis and forecasting of human behavior towards health topics, and could further assist with Health Assessment in the US and in other countries and regions where valid official health data are available.
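
To make the two steps named in the abstract concrete (a correlation check, then a logarithmic model), the following is a minimal sketch and not the authors' implementation. It assumes the model takes the form prevalence = a + b · ln(hits), which is one reading of "logarithmic relationship"; the abstract does not state the direction or exact form. The function names and all numbers are hypothetical, and the data are generated from a known curve purely so the fit can be checked.

```python
import math


def fit_log_model(x, y):
    """Ordinary least squares fit of y = a + b * ln(x); returns (a, b)."""
    z = [math.log(v) for v in x]  # linearise: regress y on ln(x)
    n = len(z)
    z_mean = sum(z) / n
    y_mean = sum(y) / n
    sxx = sum((zi - z_mean) ** 2 for zi in z)
    sxy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    b = sxy / sxx
    a = y_mean - b * z_mean
    return a, b


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic placeholder values, NOT real Google Trends or health data.
# "hits" stands in for normalised search interest (0-100 scale).
hits = [20.0, 35.0, 50.0, 65.0, 80.0, 100.0]
# Generated from a known curve (a = 2, b = 3) so the fit is verifiable.
prevalence = [2.0 + 3.0 * math.log(h) for h in hits]

a, b = fit_log_model(hits, prevalence)
r = pearson_r([math.log(h) for h in hits], prevalence)
```

Because the placeholder series is noise-free, the fit recovers the generating coefficients and the correlation on the log scale is essentially 1; real series would show scatter, and the correlation would need a significance test before any model is estimated.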

Keywords