Textual geolocation in Hebrew: mapping challenges via natural place description analysis

Tal Bauman; Tzuf Paz-Argaman; Itai Mondshine; Reut Tsarfaty; Itzhak Omer; Sagi Dalyot

doi:10.5311/JOSIS.2024.28.323

Journal of Spatial Information Science (Jun 2024)

Affiliations

Tal Bauman: The Technion
Tzuf Paz-Argaman: Bar-Ilan University
Itai Mondshine: Bar-Ilan University
Reut Tsarfaty: Bar-Ilan University
Itzhak Omer: Tel Aviv University
Sagi Dalyot: The Technion

DOI: https://doi.org/10.5311/JOSIS.2024.28.323
Journal volume & issue: no. 28
pp. 103 – 127

Abstract

Describing where a place is situated is an innate communication skill that relies on spatial cognition, spatial reasoning, and linguistic systems. Accordingly, textual geolocation, the task of retrieving the coordinates of a place from linguistic descriptions, requires computerized spatial inference and natural language understanding. Yet machine-based textual geolocation is currently limited, mainly due to the lack of rich geo-textual datasets needed to train natural language models, which in turn cannot adequately interpret language-based expressions. These limitations are intensified in morphologically rich and resource-poor languages, such as Hebrew. This study aims to analyze and understand the linguistic systems used for place descriptions in Hebrew, which can later be used to train machine learning natural language models. A novel crowdsourced geo-textual dataset is developed, composed of 5,695 written place descriptions provided by 1,554 native Hebrew speakers. All place descriptions rely on memory only, which increases spatial vagueness and requires referring expression resolution. Qualitative linguistic analysis of the place descriptions shows that geospatial reasoning is widely used in Hebrew, while empirical analysis with textual geolocation engines indicates that literal descriptions pose challenges for existing methods: they require real understanding of space and geospatial references and cannot simply be geolocated by matching extracted textual geo-entities against a gazetteer. The findings offer an improved understanding of the challenges entailed in natural language processing of Hebrew geolocation, contributing to the formalization of computerized systems to be used in future machine learning models for complex geographic information retrieval tasks.

Published in Journal of Spatial Information Science

ISSN: 1948-660X (Online)
Publisher: University of Maine
Country of publisher: United States
LCC subjects: Geography. Anthropology. Recreation: Geography (General)
Website: http://www.josis.org
