Frontiers in Earth Science (May 2020)
Analysis of Online News Coverage on Earthquakes Through Text Mining
Abstract
News agencies work around the clock to report critical news such as earthquakes. We investigate the relationship between online news articles and seismic events that happen around the world in real time. We utilize computer text mining tools to automatically harvest, identify, cluster and extract information from earthquake-related reports, and carry out cross-validation on the mined information. Earthquake parameters retrieved from the United States Geological Survey (USGS) Application Programming Interface (API) are organized into earthquake events, with each event consisting of daily earthquake readings taking place in a particular geographical location. The results are then visualized on a user-friendly dashboard. 268,182 news reports published by 23 news agencies from different parts of the world and 14,717 earthquakes of magnitude ranging from 4 to 8.2 listed in the bulletin were processed during a 1-year study between 2018 and 2019. 1.25% of the analyzed articles had the word “quake” and 0.4% were clustered and then mapped to an earthquake event. The use of multilingual news sources from 16 countries (6 languages) gives the advantage of reducing potential news bias originating from English-written reports only. The mapping of articles with an earthquake catalog helps verify earthquake reports and determine relationships. We find that the distribution of the reported seismicity is from earthquakes that occur on or very close to land. We propose a general relationship between the number of news agencies, the earthquake magnitude and the anticipated number of published articles. News reports tend to mention higher earthquake magnitudes than those in the USGS earthquake catalog, and the reports on earthquakes can last from a few days to a couple of weeks following the earthquake.
Keywords