Applied Network Science (Aug 2018)

An illustrated approach to Soft Textual Cartography

  • Raphaël Ceré,
  • Mattia Egloff

DOI
https://doi.org/10.1007/s41109-018-0087-y
Journal volume & issue
Vol. 3, no. 1
pp. 1 – 17

Abstract

We propose and illustrate an approach to Soft Textual Cartography that clusters regions by taking into account both their spatial relationships and their textual description within a corpus. We reduce large geo-referenced textual content to topics and merge them with their spatial configuration to reveal spatial patterns. The strategy consists in constructing a complex weighted network that reflects the geographical layout and whose nodes are further characterised by their thematic dissimilarity, extracted from topic modelling. A soft k-means procedure, taking into account both aspects through expectation maximisation on Gaussian mixture models and label propagation, converges towards a soft membership, to be further compared with expert knowledge on regions. An application to the Wikipedia pages of Swiss municipalities demonstrates the potential of the approach, revealing textual autocorrelation and associations with official classifications. The synergy of the spatial and textual aspects appears promising for topic interpretation and geographical information retrieval, and can incorporate expert knowledge through the choice of the initial membership.
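
To make the idea of a soft membership driven by both text and space more concrete, the following is a minimal sketch, not the authors' algorithm. It assumes each region is described by a topic-proportion vector and that spatial weights between regions are given. It alternates a Gaussian-mixture-style soft k-means update with a simple neighbourhood-averaging step standing in for label propagation. The function name `soft_spatial_kmeans`, the parameters `beta` and `alpha`, and the toy data are all hypothetical and chosen for illustration only.

```python
import numpy as np

def soft_spatial_kmeans(X, W, Z0, beta=5.0, alpha=0.3, n_iter=50):
    """Illustrative sketch only (not the published procedure).

    X  : (n, d) region features, e.g. topic proportions (assumed input)
    W  : (n, n) non-negative spatial weights; every region is assumed
         to have at least one neighbour
    Z0 : (n, m) initial soft membership, e.g. from expert knowledge
    beta  : hypothetical inverse temperature (larger = harder clusters)
    alpha : hypothetical share given to the neighbours' memberships
    """
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    Z = np.asarray(Z0, dtype=float)
    P = W / W.sum(axis=1, keepdims=True)   # row-stochastic spatial weights
    Z = Z / Z.sum(axis=1, keepdims=True)   # each row sums to 1
    for _ in range(n_iter):
        rho = Z.mean(axis=0)                               # group weights
        centroids = (Z.T @ X) / (Z.sum(axis=0)[:, None] + 1e-12)
        # Squared Euclidean dissimilarity of each region to each centroid
        D = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        # Soft assignment: membership proportional to rho * exp(-beta * D)
        logits = np.log(rho + 1e-12) - beta * D
        logits -= logits.max(axis=1, keepdims=True)        # numerical stability
        Z = np.exp(logits)
        Z /= Z.sum(axis=1, keepdims=True)
        # Spatial smoothing: blend with the neighbours' average membership
        Z = (1 - alpha) * Z + alpha * (P @ Z)
    return Z

# Toy data (invented): six regions on a line, two thematic profiles.
X = np.array([[0.90, 0.10], [0.80, 0.20], [0.85, 0.15],
              [0.10, 0.90], [0.20, 0.80], [0.15, 0.85]])
W = np.zeros((6, 6))
for i in range(5):                 # chain adjacency: region i touches i + 1
    W[i, i + 1] = W[i + 1, i] = 1.0
Z0 = np.array([[0.6, 0.4]] * 3 + [[0.4, 0.6]] * 3)  # weak initial prior

Z = soft_spatial_kmeans(X, W, Z0)
print(Z.round(2))
```

In this sketch the initial membership `Z0` plays the role that the abstract assigns to expert knowledge: different starting memberships can steer the iteration towards different soft partitions.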

Keywords