Dictionary-based and machine learning classification approaches: a comparison for tonality and frame detection on Twitter data

Maud Reveilhac; Davide Morselli

doi:10.1080/2474736X.2022.2029217

Political Research Exchange (Dec 2022)

Affiliations

Maud Reveilhac: Institute of Social Sciences, Life Course and Social Inequality Research Centre, Lausanne University
Davide Morselli: Institute of Social Sciences, Life Course and Social Inequality Research Centre, Lausanne University

DOI: https://doi.org/10.1080/2474736X.2022.2029217
Journal volume & issue: Vol. 4, no. 1

Abstract

Automated text analysis methods make it possible to classify large text corpora by measures such as frames and tonality, and they are increasingly popular in the social, political and psychological sciences. These methods often require a training dataset large enough to produce accurate models that can be applied to unseen texts. In practice, however, there are no clear recommendations about how large the training sample should be. The issue becomes especially acute when texts are skewed toward certain categories and when researchers cannot afford large samples of annotated texts. Using support for democracy as a case, we provide a guide to help researchers navigate decisions when producing measures of tonality and frames from a small sample of annotated social media posts. We find that supervised machine learning algorithms outperform dictionaries for tonality classification tasks. However, custom dictionaries are useful complements to these algorithms when identifying latent democracy dimensions in social media messages, especially as these dictionaries are developed with the guidance of word embedding techniques and human validation. We therefore provide easily implementable recommendations to increase estimation accuracy under non-optimal conditions.

Published in Political Research Exchange

ISSN: 2474-736X (Online)
Publisher: Taylor & Francis Group
Country of publisher: United Kingdom
LCC subjects: Political science
Website: https://www.tandfonline.com/journals/prxx
