Journal of Language Modelling (Nov 2023)
We thought the eyes of coreference were shut to multiword expressions and they mostly are
Abstract
Multiword expressions are combinations of words that exhibit peculiar semantic properties, such as different degrees of non-compositionality, decomposability, transparency and figuration. Long-standing linguistic debates suggest that such semantic idiosyncrasy can condition the morpho-syntactic configurations in which a given multiword expression can occur. Here, we extend this argumentation to a particular semantic and pragmatic phenomenon: nominal coreference. We hypothesise that the internal components of a multiword expression are unlikely to occur in coreference chains. While previous work has identified the rareness of coreference-related phenomena in the presence of multiword expressions, this observation has never been quantified, to the best of our knowledge. We bridge this gap by performing an automated corpus-based study of the intersections between verbal multiword expressions and nominal coreference in French. The results largely corroborate our hypothesis but also display various tendencies depending on the type of multiword expression and the corpus genre. The analysis of the corpus examples highlights interesting properties of coreference, notably in speech.