Scientific Reports (Feb 2021)
Infectious disease outbreak prediction using media articles with machine learning models
Abstract
When a newly emerging infectious disease breaks out in a country, it causes serious damage to both public health and the national economy. It is therefore necessary to anticipate which diseases are likely to emerge and to prepare countermeasures in advance. Many different infectious diseases are emerging and threatening human health worldwide, which makes detecting patterns of emerging infectious disease critical. However, because epidemics spread sporadically and rapidly, it is difficult to predict whether an infectious disease will emerge, and it is also difficult to accumulate data on any specific disease. Useful data sources must therefore be found and used to build prediction models. Online news outlets publish numerous articles every day that quickly reflect current issues. In this research, we collected Internet articles related to infectious disease from Medisys, from January to December 2019, and evaluated whether this news data could be used to detect newly emerging infectious diseases and predict outbreaks. Support Vector Machine (SVM), Semi-supervised Learning (SSL), and Deep Neural Network (DNN) models were used for prediction, both to examine the usefulness of the information embedded in web articles and to detect patterns of emerging infectious disease.
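
As a rough illustration of the general idea of classifying news text with an SVM, the sketch below trains a very small linear SVM on bag-of-words counts using subgradient descent on the hinge loss. It is not the authors' pipeline: the headlines, the labels (+1 for outbreak-related, -1 for other), the whitespace tokenizer, and the hyperparameters `lr`, `lam`, and `epochs` are all hypothetical choices made for this example, and the abstract does not describe the features or settings actually used.

```python
from collections import Counter

# Hypothetical toy headlines; label +1 = outbreak-related, -1 = other.
ARTICLES = [
    ("new outbreak of measles cases reported", 1),
    ("health officials confirm cholera outbreak", 1),
    ("vaccine research funding announced", -1),
    ("hospital opens new research wing", -1),
]


def featurize(text):
    """Bag-of-words term counts for one article (assumed preprocessing)."""
    return Counter(text.lower().split())


def score(weights, bias, features):
    """Linear decision value w.x + b; unseen words contribute 0."""
    return sum(weights.get(t, 0.0) * c for t, c in features.items()) + bias


def train_svm(articles, lr=0.1, lam=0.01, epochs=50):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss."""
    weights, bias = {}, 0.0
    data = [(featurize(text), label) for text, label in articles]
    for _ in range(epochs):
        for features, label in data:
            margin = label * score(weights, bias, features)
            # L2 regularization shrinks every weight slightly at each step.
            weights = {t: w * (1 - lr * lam) for t, w in weights.items()}
            if margin < 1:  # inside the margin or misclassified
                for t, c in features.items():
                    weights[t] = weights.get(t, 0.0) + lr * label * c
                bias += lr * label
    return weights, bias


def predict(model, text):
    """Return +1 (outbreak-related) or -1 (other) for a piece of text."""
    weights, bias = model
    return 1 if score(weights, bias, featurize(text)) >= 0 else -1


model = train_svm(ARTICLES)
```

Calling `predict(model, "cholera outbreak reported")` on this toy data returns the outbreak-related label. A realistic study would need a labelled article corpus, a proper text representation, and held-out evaluation, none of which this sketch attempts.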