PLoS ONE (Jan 2020)
Understanding the spatial dimension of natural language by measuring the spatial semantic similarity of words through a scalable geospatial context window.
Abstract
Measuring the semantic similarity between words is important for natural language processing tasks. Traditional models of semantic similarity perform well in most cases, but when words involve geographical context, the spatial semantics carried by their implied spatial information are rarely preserved. Geographic information retrieval (GIR) methods have focused on this issue; however, they sometimes fail to solve the problem because they consider and calculate the spatial and textual similarities of words separately. In this paper, from the perspective of spatial context, we consider the two parts as a whole (spatial context semantics), and we propose a method that measures the spatial semantic similarity of geo-tagged words using a sliding geospatial context window. The proposed method was first validated with a set of simulated data and then applied to a real-world dataset from Flickr. The result is a spatial semantic similarity model at different scales. We believe this model is a necessary supplement to traditional textual-language semantic analyses of words obtained by word-embedding technologies. This study has the potential to improve the quality of recommendation systems by considering relevant spatial context semantics, and it benefits linguistic semantic research by emphasising the spatial cognition among words.
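To make the idea of a geospatial context window concrete, the following is a minimal illustrative sketch, not the procedure used in the paper. It assumes that each geo-tagged word is a point in planar coordinates, that a word's spatial context is the set of other words lying within a chosen radius, and that similarity is the cosine of the resulting co-occurrence count vectors. The function names, the `radius` parameter, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict

def context_vectors(points, radius):
    """Map each word to a Counter of the words found within `radius` of it.

    `points` is a list of (word, x, y) tuples in planar units (assumed, e.g. metres).
    """
    vectors = defaultdict(Counter)
    for i, (wi, xi, yi) in enumerate(points):
        for j, (wj, xj, yj) in enumerate(points):
            # Count every other point that falls inside the circular window.
            if i != j and math.hypot(xi - xj, yi - yj) <= radius:
                vectors[wi][wj] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def spatial_similarity(points, w1, w2, radius):
    """Similarity of two words' spatial contexts at the scale set by `radius`."""
    vecs = context_vectors(points, radius)
    return cosine(vecs[w1], vecs[w2])

# Toy data (hypothetical): two spatially separate clusters of tags.
sample_points = [
    ("beach", 0, 0), ("surf", 1, 0), ("sea", 0, 1),
    ("museum", 100, 100), ("art", 101, 100),
]

small_scale = spatial_similarity(sample_points, "beach", "museum", radius=2)
large_scale = spatial_similarity(sample_points, "beach", "museum", radius=200)
```

In this toy setting, "beach" and "museum" share no context at a radius of 2 but share most of their context at a radius of 200, which illustrates how changing the window size changes the scale at which similarity is measured. The pairwise loop is quadratic in the number of points; a real dataset would need a spatial index.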