Statistical inferences for polarity identification in natural language.

Nicolas Pröllochs; Stefan Feuerriegel; Dirk Neumann

doi:10.1371/journal.pone.0209323

PLoS ONE (Jan 2018)

DOI: https://doi.org/10.1371/journal.pone.0209323
Journal volume & issue: Vol. 13, no. 12, p. e0209323

Abstract

Information forms the basis for all human behavior, including the ubiquitous decision-making that people constantly perform in their everyday lives. It is thus the mission of researchers to understand how humans process information to reach decisions. To facilitate this task, this work proposes LASSO regularization as a statistical tool for extracting decisive words from textual content and thereby studying the reception of granular expressions in natural language. This differs from the usual use of the LASSO as a predictive model and, instead, yields highly interpretable statistical inferences about the relationship between the occurrences of words and an outcome variable. Accordingly, the method has direct implications for the social sciences: it serves as a statistical procedure for generating domain-specific dictionaries, as opposed to the frequently employed heuristics. In addition, researchers can now identify text segments and word choices that are statistically decisive to authors or readers and, based on this knowledge, test hypotheses from behavioral research.
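
The core idea of the abstract, regressing an outcome variable on word occurrences under an L1 penalty so that only decisive words keep nonzero coefficients, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not the authors' implementation: the documents, outcome scores, penalty value, and the helper names soft_threshold and lasso_fit are all invented here, and the sketch fits a plain LASSO by coordinate descent without the statistical inference step the paper refers to.

```python
# Illustrative sketch only: the documents, outcome scores, and penalty value
# below are invented toy data, not taken from the paper.
from collections import Counter


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_fit(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for the objective
    (1 / (2n)) * sum_i (y_i - b0 - x_i . beta)^2 + lam * sum_j |beta_j|.
    The intercept b0 is not penalized."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    b0 = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # inner product of feature j with the partial residual
            sq = 0.0   # squared norm of feature j
            for i in range(n):
                fit_without_j = b0 + sum(
                    X[i][k] * beta[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - fit_without_j)
                sq += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (sq / n) if sq > 0 else 0.0
        # The unpenalized intercept is the mean of the current residuals.
        b0 = sum(
            y[i] - sum(X[i][k] * beta[k] for k in range(p)) for i in range(n)
        ) / n
    return b0, beta


# Hypothetical corpus: each document is paired with a numeric outcome
# (for example, a rating or a market reaction).
docs = [
    "the product is good",
    "the product is bad",
    "good good the service",
    "bad service the",
    "the product",
    "the service is good",
    "bad bad product",
    "the service is",
]
scores = [1.0, -1.0, 2.0, -1.0, 0.0, 1.0, -2.0, 0.0]

# Bag-of-words design matrix: one column per vocabulary word, raw counts.
vocab = sorted({word for doc in docs for word in doc.split()})
X = [[Counter(doc.split())[word] for word in vocab] for doc in docs]

intercept, beta = lasso_fit(X, scores, lam=0.05)

# Words whose coefficient survives the penalty form a small polarity dictionary.
decisive = {word: round(b, 3) for word, b in zip(vocab, beta) if b != 0.0}
print(decisive)
```

On this toy input, only "good" and "bad" retain nonzero coefficients (positive and negative, respectively), while neutral words such as "the" or "product" are set exactly to zero. In practice the penalty lam would be chosen by a data-driven procedure such as cross-validation, and a tested library implementation would replace the hand-written solver.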

Published in PLoS ONE

ISSN: 1932-6203 (Online)
Publisher: Public Library of Science (PLoS)
Country of publisher: United States
LCC subjects: Medicine; Science
Website: https://journals.plos.org/plosone/
