Text GCN-SW-KNN: a novel collaborative training multi-label classification method for WMS application themes by considering geographic semantics

Zhengyang Wei; Zhipeng Gui; Min Zhang; Zelong Yang; Yuao Mei; Huayi Wu; Hongbo Liu; Jing Yu

doi:10.1080/20964471.2021.1877434

Big Earth Data (Jan 2021)

Affiliations

Zhengyang Wei: Wuhan University
Zhipeng Gui: Wuhan University
Min Zhang: Wuhan University
Zelong Yang: Wuhan University
Yuao Mei: Wuhan University
Huayi Wu: Wuhan University
Hongbo Liu: Chongqing Geomatics and Remote Sensing Center
Jing Yu: Chongqing Geomatics and Remote Sensing Center

Journal volume & issue: Vol. 5, no. 1
pp. 66 – 89

Abstract

Without explicit descriptions of map application themes, it is difficult for users to discover desired map resources among massive online Web Map Services (WMS). However, metadata-based extraction of map application themes is a challenging multi-label text classification task because of limited training samples, mixed vocabularies, and the variable length and arbitrary content of text fields. In this paper, we propose a novel multi-label text classification method, Text GCN-SW-KNN, based on geographic semantics and collaborative training to improve classification accuracy. The semi-supervised collaborative training adopts two base models: Text GCN-SW, a Text Graph Convolutional Network (Text GCN) modified by utilizing the Semantic Web, and the widely used Multi-Label K-Nearest Neighbor (ML-KNN). Text GCN-SW improves on Text GCN by adjusting the adjacency matrix of the heterogeneous word-document graph with the shortest semantic distances between themes and words in the metadata text. The distances are calculated with the Semantic Web for Earth and Environmental Terminology (SWEET) and WordNet dictionaries. Experiments on both WMS and layer metadata show that the proposed methods achieve higher F1-score and accuracy than state-of-the-art baselines, and demonstrate better stability across repeated experiments and robustness to reduced training data. Text GCN-SW-KNN can be extended to other multi-label text classification scenarios to better support metadata enhancement and geospatial resource discovery in the Earth science domain.
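
The adjacency adjustment described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so everything below is an assumption made for illustration: a toy term graph (`TERM_GRAPH`) stands in for SWEET and WordNet, the boost `1 / (1 + d)` and the multiplicative scaling of word-document edges are hypothetical choices, and all function names are invented here. It shows only the general idea of letting the shortest semantic distance between a word and the candidate themes influence edge weights.

```python
from collections import deque

# Toy undirected term graph; a hypothetical stand-in for SWEET/WordNet relations.
TERM_GRAPH = {
    "rainfall": ["precipitation"],
    "precipitation": ["rainfall", "climate"],
    "climate": ["precipitation"],
    "road": ["transportation"],
    "transportation": ["road"],
    "server": [],
}
THEMES = ["climate", "transportation"]


def shortest_distance(graph, source, target):
    """Return the hop count between two terms via BFS, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def semantic_boost(graph, word, themes):
    """Assumed weighting: 1 / (1 + d) for the nearest theme, 0.0 if none is reachable."""
    distances = [shortest_distance(graph, word, theme) for theme in themes]
    reachable = [d for d in distances if d is not None]
    return 1.0 / (1.0 + min(reachable)) if reachable else 0.0


def adjust_edges(edges, graph, themes):
    """Scale each (word, document) edge weight by 1 + boost (an assumed scheme)."""
    return {
        (word, doc): weight * (1.0 + semantic_boost(graph, word, themes))
        for (word, doc), weight in edges.items()
    }


# Illustrative word-document weights, e.g. TF-IDF values in a Text GCN graph.
EDGES = {("rainfall", "doc1"): 0.5, ("road", "doc1"): 0.2, ("server", "doc1"): 0.4}
ADJUSTED = adjust_edges(EDGES, TERM_GRAPH, THEMES)
```

In this sketch, words that lie close to a theme in the term graph gain weight on their document edges, while words with no path to any theme (such as `server`) keep their original weight.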

Published in Big Earth Data

ISSN: 2096-4471 (Print); 2574-5417 (Online)
Publisher: Taylor & Francis Group
Country of publisher: United Kingdom
LCC subjects: Geography. Anthropology. Recreation; Science: Geology
Website: https://www.tandfonline.com/journals/tbed
