Satellite and instrument entity recognition using a pre-trained language model with distant supervision

Ming Lin; Meng Jin; Yufu Liu; Yuqi Bai

doi:10.1080/17538947.2022.2107098

International Journal of Digital Earth (Dec 2022)

Affiliations

Ming Lin: Tsinghua University
Meng Jin: Tsinghua University
Yufu Liu: Tsinghua University
Yuqi Bai: Tsinghua University

DOI: https://doi.org/10.1080/17538947.2022.2107098
Journal volume & issue: Vol. 15, no. 1, pp. 1290–1304

Abstract

Earth observations, especially satellite data, have yielded a wealth of methods and results for addressing global challenges, and these are often reported in unstructured texts such as papers or reports. Accurately extracting satellite and instrument entities from these texts can help to link and reuse Earth observation resources. Directly using an existing dictionary to extract satellite and instrument entities suffers from poor matching, which leads to low recall. In this study, we present a named entity recognition model that automatically extracts satellite and instrument entities from unstructured texts. Because manually labeled data are lacking, we apply distant supervision to generate labeled training data automatically. We then fine-tune a pre-trained language model with early stopping and a weighted cross-entropy loss function. We also propose a dictionary-based self-training method to correct the incomplete annotations caused by distant supervision. Experiments demonstrate that our method achieves significant improvements in both precision and recall compared with dictionary matching or standard adaptation of pre-trained language models.
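
The distant supervision step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a greedy longest-match lookup and a BIO tagging scheme, neither of which is confirmed by this record; the dictionary entries, the `SAT`/`INS` labels, and the function name `distant_label` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy dictionary (assumption): lower-cased token tuples -> entity type.
# The dictionary actually used by the authors is not given in this record.
ENTITY_DICT: Dict[Tuple[str, ...], str] = {
    ("landsat", "8"): "SAT",
    ("terra",): "SAT",
    ("modis",): "INS",
}


def distant_label(tokens: List[str],
                  entity_dict: Dict[Tuple[str, ...], str] = ENTITY_DICT) -> List[str]:
    """Assign BIO tags by greedy longest-match lookup against a dictionary."""
    max_len = max(len(key) for key in entity_dict)
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            label = entity_dict.get(key)
            if label is not None:
                tags[i] = "B-" + label
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


tokens = "MODIS aboard Terra and Landsat 8 imagery".split()
print(list(zip(tokens, distant_label(tokens))))
```

Any mention missing from the dictionary (for example, a satellite name that the toy dictionary above does not contain) is tagged `O` by this procedure. This is the kind of incomplete annotation that the abstract says the dictionary-based self-training method is meant to correct.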

Published in International Journal of Digital Earth

ISSN: 1753-8947 (Print); 1753-8955 (Online)
Publisher: Taylor & Francis Group
Country of publisher: United Kingdom
LCC subjects: Geography. Anthropology. Recreation: Mathematical geography. Cartography
Website: https://www.tandfonline.com/journals/tjde
