International Journal of Digital Earth (Sep 2018)

Towards intelligent geospatial data discovery: a machine learning framework for search ranking

  • Yongyao Jiang,
  • Yun Li,
  • Chaowei Yang,
  • Fei Hu,
  • Edward M. Armstrong,
  • Thomas Huang,
  • David Moroni,
  • Lewis J. McGibbney,
  • Christopher J. Finch

DOI
https://doi.org/10.1080/17538947.2017.1371255
Journal volume & issue
Vol. 11, no. 9
pp. 956 – 971

Abstract

Current search engines in most geospatial data portals tend to steer users towards a single dimension of data characteristics (e.g. popularity or release date). This approach largely fails to account for users’ multidimensional preferences for geospatial data, and hence is likely to result in a less than optimal user experience in discovering the most applicable dataset. This study reports a machine learning framework to address the ranking challenge, the fundamental obstacle in geospatial data discovery, by (1) identifying a number of ranking features of geospatial data to represent users’ multidimensional preferences by considering semantics, user behavior, spatial similarity, and static dataset metadata attributes; (2) applying a machine learning method to automatically learn a ranking function; and (3) proposing a system architecture that combines existing search-oriented open source software, a semantic knowledge base, ranking feature extraction, and a machine learning algorithm. Results show that the machine learning approach outperforms other methods in terms of both precision at K and normalized discounted cumulative gain. As an early attempt at using machine learning to improve search ranking in the geospatial domain, we expect this work to set an example for further research and open the door towards intelligent geospatial data discovery.
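
The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It uses the common textbook definitions (linear gain, log2 position discount), which may differ in detail from the formulation used in the study. The function names, the relevance threshold, and the sample relevance grades are illustrative assumptions, not values from the paper.

```python
import math


def precision_at_k(relevances, k, threshold=1):
    """Share of the top-k results whose relevance grade is >= threshold.

    `relevances` lists graded relevance labels in ranked order.
    The threshold for counting a result as relevant is an assumption.
    The denominator is always k, even if fewer than k results exist.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for r in relevances[:k] if r >= threshold)
    return hits / k


def dcg_at_k(relevances, k):
    """Discounted cumulative gain: gain r at 0-based rank i is r / log2(i + 2)."""
    return sum(r / math.log2(i + 2) for i, r in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG divided by the DCG of the ideal (descending) ordering of the same labels."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance grades for one query, in the order a ranker returned them.
ranked = [3, 0, 2, 1, 0]
p3 = precision_at_k(ranked, 3)
n5 = ndcg_at_k(ranked, 5)
```

Precision at K only asks how many of the top K results are relevant, while NDCG also rewards placing the most relevant datasets nearest the top, which is why the two are often reported together.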

Keywords