NLP4NLP+5: The Deep (R)evolution in Speech and Language Processing

Joseph Mariani; Gil Francopoulo; Patrick Paroubek; Frédéric Vernier

doi:10.3389/frma.2022.863126

Frontiers in Research Metrics and Analytics (Jul 2022)

Affiliations

Joseph Mariani: Université Paris-Saclay, CNRS, Laboratoire Interdisciplinaire des Sciences du Numérique, Orsay, France
Gil Francopoulo: Tagmatica, Paris, France
Patrick Paroubek: Université Paris-Saclay, CNRS, Laboratoire Interdisciplinaire des Sciences du Numérique, Orsay, France
Frédéric Vernier: Université Paris-Saclay, CNRS, Laboratoire Interdisciplinaire des Sciences du Numérique, Orsay, France

DOI: https://doi.org/10.3389/frma.2022.863126
Journal volume & issue: Vol. 7

Abstract

This paper analyzes the changes in the fields of speech and natural language processing over the past 5 years (2016–2020). It continues a series of two papers that we published in 2019 on the analysis of the NLP4NLP corpus, which contained articles published in 34 major conferences and journals in the field of speech and natural language processing over a period of 50 years (1965–2015) and was analyzed with methods developed in the field of NLP, hence its name. The extended NLP4NLP+5 corpus now covers 55 years and comprises close to 90,000 documents [+30% compared with NLP4NLP: as many articles were published in the single year 2020 as over the first 25 years (1965–1989)], 67,000 authors (+40%), 590,000 references (+80%), and approximately 380 million words (+40%). The analyses are conducted globally or comparatively among sources, and also in comparison with the general scientific literature, with a focus on the past 5 years. They reveal profound changes in research topics, the emergence of a new generation of authors, and the appearance of new publications around artificial intelligence, neural networks, machine learning, and word embedding.

Published in Frontiers in Research Metrics and Analytics

ISSN: 2504-0537 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Bibliography. Library science. Information resources
Website: http://journal.frontiersin.org/journal/research-metrics-and-analytics
