Position Information in Transformers: An Overview

Philipp Dufter; Martin Schmitt; Hinrich Schütze

doi:10.1162/coli_a_00445

Computational Linguistics (Jun 2022)

DOI: https://doi.org/10.1162/coli_a_00445
Journal volume & issue: Vol. 48, no. 3

Abstract

Transformers are arguably the main workhorse in recent natural language processing research. By definition, a Transformer is invariant with respect to reordering of the input. However, language is inherently sequential and word order is essential to the semantics and syntax of an utterance. In this article, we provide an overview and theoretical comparison of existing methods to incorporate position information into Transformer models. The objectives of this survey are to (1) showcase that position information in Transformers is a vibrant and extensive research area; (2) enable the reader to compare existing methods by providing a unified notation and systematization of different approaches along important model dimensions; (3) indicate what characteristics of an application should be taken into account when selecting a position encoding; and (4) provide stimuli for future research.
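
The reordering claim in the abstract can be illustrated with a small sketch. The code below is not taken from the article. It assumes a single attention head with no position encoding, no residual connections, and no layer normalization, and all names (`self_attention`, `mean_pool`, `perm`) are hypothetical. Under those assumptions, permuting the input rows only permutes the output rows in the same way, so any order-independent pooling of the outputs is unchanged.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention WITHOUT position information."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    d = len(q[0])
    scores = [[sum(a * b for a, b in zip(qr, kr)) / math.sqrt(d) for kr in k] for qr in q]
    weights = [softmax(r) for r in scores]
    return matmul(weights, v)


def mean_pool(rows):
    """Average the token vectors; the result does not depend on token order."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for x, y in zip(a, b))


def rand_matrix(rng, n_rows, n_cols):
    return [[rng.uniform(-1, 1) for _ in range(n_cols)] for _ in range(n_rows)]


rng = random.Random(0)
d_model, seq_len = 4, 5                      # illustrative sizes (assumption)
x = rand_matrix(rng, seq_len, d_model)       # stand-in for token embeddings
wq, wk, wv = (rand_matrix(rng, d_model, d_model) for _ in range(3))

perm = [3, 0, 4, 1, 2]                       # an arbitrary reordering of the tokens
x_perm = [x[i] for i in perm]

out = self_attention(x, wq, wk, wv)
out_perm = self_attention(x_perm, wq, wk, wv)

# Each token gets the same output vector; only the row order changes.
equivariant = all(close(out_perm[j], out[perm[j]]) for j in range(seq_len))
# After order-independent pooling, the two inputs are indistinguishable.
invariant_after_pooling = close(mean_pool(out), mean_pool(out_perm))
print(equivariant, invariant_after_pooling)
```

Adding any position-dependent term to `x` before the attention step would break both checks, which is the role a position encoding plays.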

Published in Computational Linguistics

ISSN: 0891-2017 (Print); 1530-9312 (Online)
Publisher: The MIT Press
Country of publisher: United States
LCC subjects: Language and Literature: Philology. Linguistics: Computational linguistics. Natural language processing
Website: https://direct.mit.edu/coli