Cybergeo (Jun 2021)
Thirty million addresses: boulevards and dead ends of a study on the lexical and spatial complexity of odonyms
Abstract
Odonyms, or street names, are linguistically complex in France. They usually have at least two parts: a "generic" (street, place, avenue…) with a general scope, and a "specific" that attaches a name to it. Decomposed in this way, odonyms carry the traces of an underlying geography whose structure remains to be studied. In 2018, the National Addresses Database (BAN) listed approximately 28.5 million geolocated addresses in France. From these addresses, we extracted the 2.33 million odonyms of the French municipalities. An innovative methodology combining Natural Language Processing with a custom-built ontology of 448 generics allowed us to perform Part-of-Speech labelling and to split the odonyms into their constituents, which are stored in a coherent and homogeneous geolocated database. A cartographic study of the distribution of generics across France at the departmental level was then carried out. Contrary to expectations, their distribution turns out to be non-homogeneous across the country, even for some very common generics such as "rue".
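As an illustration only, the generic/specific decomposition described above can be sketched with a simple rule-based split. This is not the NLP pipeline or the 448-entry ontology used in the study: the `GENERICS` tuple is a small hypothetical sample, and `split_odonym` is a hypothetical helper that only recognises a generic when it is the first word of the name.

```python
import unicodedata
from typing import Optional, Tuple

# Hypothetical sample of generics; the study's ontology lists 448 of them.
GENERICS = ("rue", "avenue", "boulevard", "impasse", "place", "chemin", "allée")


def _normalize(text: str) -> str:
    """Lowercase and strip accents so that 'Allée' and 'ALLEE' compare equal."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# Map each accent-free form back to its canonical spelling.
_CANONICAL = {_normalize(g): g for g in GENERICS}


def split_odonym(odonym: str) -> Tuple[Optional[str], str]:
    """Return (generic, specific); generic is None when no known generic leads the name."""
    words = odonym.split()
    if words:
        generic = _CANONICAL.get(_normalize(words[0]))
        if generic is not None:
            return generic, " ".join(words[1:])
    return None, " ".join(words)


print(split_odonym("Rue de la Paix"))
print(split_odonym("ALLEE des Tilleuls"))
print(split_odonym("Le Bourg"))
```

Under these assumptions, names without a leading generic (hamlet or locality names, for instance) come back with `None` as their generic. Real odonyms are harder to split than this, which is the kind of case a dedicated ontology and Part-of-Speech labelling are meant to handle.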
Keywords