Discours (Dec 2012)
So to Speak: A Computational and Empirical Investigation of Lexical Cohesion of Non-Literal and Literal Expressions in Text
Abstract
Lexical cohesion is an important device for signaling text organization. In this paper, we investigate to what extent a particular class of expressions that can have a non-literal interpretation participates in the cohesive structure of a text. Specifically, we look at five verb-headed expressions that, depending on the context, can have either a literal or a non-literal meaning: bounce off the wall (“to be excited and full of nervous energy”), get one’s feet wet (“to start a new activity or job”), rock the boat (“to disturb the balance or routine of a situation”), break the ice (“to start to get to know people, to overcome initial shyness”), and play with fire (“to take part in a dangerous or risky undertaking”). We look at the problem from both an empirical and a computational perspective. The results of our empirical study suggest that both literal and non-literal expressions exhibit cohesion with their textual context, but that non-literal expressions appear to do so to a lesser extent. We also show that an automatically computable semantic relatedness measure based on search engine page counts correlates well with human intuitions about the cohesive structure of a text and can therefore be used to determine that structure automatically with a reasonable degree of accuracy. The investigation is undertaken from the perspective of computational linguistics: we aim both to model this cohesion computationally and to support that modeling with empirical data.
Keywords