Dataset on dynamics of Coronavirus on Twitter

Norman Aguilar-Gallegos; Leticia Elizabeth Romero-García; Enrique Genaro Martínez-González; Edgar Iván García-Sánchez; Jorge Aguilar-Ávila

Data in Brief (Jun 2020)

Affiliations

Norman Aguilar-Gallegos: Centro de Investigaciones Económicas, Sociales y Tecnológicas de la Agroindustria y la Agricultura Mundial (CIESTAAM), Universidad Autónoma Chapingo (UACh), Chapingo, Estado de México, México
Leticia Elizabeth Romero-García: Universidad Autónoma del Estado de México (UAEM), Estado de México, México; Corresponding author.
Enrique Genaro Martínez-González: Centro de Investigaciones Económicas, Sociales y Tecnológicas de la Agroindustria y la Agricultura Mundial (CIESTAAM), Universidad Autónoma Chapingo (UACh), Chapingo, Estado de México, México
Edgar Iván García-Sánchez: Centro de Investigaciones Interdisciplinarias sobre Desarrollo Regional (CIISDER), Universidad Autónoma de Tlaxcala (UATx), Tlaxcala, México
Jorge Aguilar-Ávila: Centro de Investigaciones Económicas, Sociales y Tecnológicas de la Agroindustria y la Agricultura Mundial (CIESTAAM), Universidad Autónoma Chapingo (UACh), Chapingo, Estado de México, México

Journal volume & issue: Vol. 30, p. 105684

Abstract

In this data article, we provide a dataset of 8,982,694 Twitter posts about the global coronavirus health crisis. The data were collected through the Twitter REST API search, using the rtweet R package to download the raw data. The search term was “Coronavirus”, which matched both the word itself and its hashtag version. We collected the data over 23 days, from January 21 to February 12, 2020. The dataset is multilingual, with English, Spanish, and Portuguese predominating. We include a new variable, derived from four other variables, called the “type” of tweet; it is useful for showing the diversity of tweets and the dynamics of users on Twitter. The dataset comprises seven databases that can be analysed separately. They can also be combined to support further research, including trends and relevance of different topics, types of tweets, the embeddedness of users and their profiles, retweet dynamics, hashtag analysis, and social network analysis. This dataset may interest researchers from different fields of knowledge, such as data science, social science, network science, health informatics, tourism, infodemiology, and others.
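
The abstract does not say which four variables feed the “type” variable or which labels it takes. As a rough illustration only, the sketch below assumes four per-tweet fields named `is_retweet`, `is_quote`, `reply_to_status_id`, and `reply_to_user_id`, and a hypothetical label set; neither the field choice nor the labels should be read as the authors' actual scheme. It is written in Python rather than R purely for illustration.

```python
from collections import Counter
from typing import Optional


def tweet_type(is_retweet: bool, is_quote: bool,
               reply_to_status_id: Optional[str],
               reply_to_user_id: Optional[str]) -> str:
    """Collapse four per-tweet fields into one label.

    Field names and labels are assumptions made for this sketch,
    not the scheme used in the published dataset.
    """
    if is_retweet:
        return "retweet"
    # Treat a tweet as a reply if either reply field is filled in.
    is_reply = reply_to_status_id is not None or reply_to_user_id is not None
    if is_quote and is_reply:
        return "quote+reply"
    if is_quote:
        return "quote"
    if is_reply:
        return "reply"
    return "original"


# Made-up rows for illustration only; not taken from the dataset.
sample = [
    (True, False, None, None),
    (False, True, None, None),
    (False, False, "123", "456"),
    (False, False, None, None),
    (False, True, "123", "456"),
]

# Tally how many tweets fall under each derived label.
type_counts = Counter(tweet_type(*row) for row in sample)
```

Counting the derived labels in this way is one simple route to the "diversity of tweets" view the abstract mentions; the same function could be applied row by row to any of the databases that carry the underlying fields.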

Published in Data in Brief

ISSN: 2352-3409 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Science (General)
Website: http://www.journals.elsevier.com/data-in-brief/
