Integrating Large Language Models in Political Discourse Studies on Social Media: Challenges of Validating an LLMs-in-the-loop Pipeline

Giada Marino; Fabio Giglietto

doi:10.6092/issn.1971-8853/19524

Sociologica (Oct 2024)

Affiliations

Giada Marino: Department of Communication Sciences, Humanities and International Studies, University of Urbino Carlo Bo
Fabio Giglietto: Department of Communication Sciences, Humanities and International Studies, University of Urbino Carlo Bo

Journal volume & issue: Vol. 18, no. 2
pp. 87 – 107

Abstract

The integration of Large Language Models (LLMs) into research workflows has the potential to transform the study of political content on social media. This essay discusses a validation protocol addressing three key aspects of LLM-integrated research: the versatility of LLMs as general-purpose models, the granularity and nuance in LLM-uncovered narratives, and the limitations of human assessment capabilities. The protocol includes phases for fine-tuning and validating a binary political classifier, evaluating cluster coherence, and assessing machine-generated cluster label accuracy. We applied this protocol to validate an LLMs-in-the-loop research pipeline designed to analyze political content on Facebook during the Italian general elections of 2018 and 2022. Our approach classifies political links, clusters them by similarity, and generates descriptive labels for clusters. This methodology presents unique validation challenges, prompting a reevaluation of accuracy assessment strategies. By sharing our experiences, this essay aims to guide social scientists in employing LLM-based methodologies, highlighting challenges and advancing recommendations for colleagues intending to integrate these tools for political content analysis on social media.
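
As an illustration of the first protocol phase, validating a binary political classifier against human annotation, the following minimal sketch computes precision, recall, F1, and Cohen's kappa. The abstract does not state which metrics the authors used, so the choice of metrics, the function name `binary_validation_metrics`, and the toy labels are all assumptions made for this example, not the authors' implementation.

```python
from collections import Counter


def binary_validation_metrics(human, model):
    """Compare model labels with human labels (1 = political, 0 = not political).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(human) != len(model) or not human:
        raise ValueError("label lists must be non-empty and of equal length")
    if any(label not in (0, 1) for label in list(human) + list(model)):
        raise ValueError("labels must be 0 or 1")

    # Count each (human, model) pair to build the confusion matrix.
    counts = Counter(zip(human, model))
    tp, fp = counts[(1, 1)], counts[(0, 1)]
    fn, tn = counts[(1, 0)], counts[(0, 0)]
    n = tp + fp + fn + tn

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (observed - expected) / (1 - expected) if expected != 1 else 1.0

    return {"precision": precision, "recall": recall, "f1": f1, "kappa": kappa}


# Toy data (invented for this example): eight links labelled by a human
# annotator and by the classifier.
human_labels = [1, 1, 1, 0, 0, 0, 1, 0]
model_labels = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_validation_metrics(human_labels, model_labels)
print(metrics)
```

Reporting a chance-corrected agreement measure alongside precision and recall is one way to account for the limits of human assessment that the abstract mentions, since raw accuracy can look high when one class dominates.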

Published in Sociologica

ISSN: 1971-8853 (Online)
Publisher: University of Bologna
Country of publisher: Italy
LCC subjects: Social Sciences: Sociology (General)
Website: https://sociologica.unibo.it/
