ISPRS International Journal of Geo-Information (Mar 2024)

Quantifying Urban Linguistic Diversity Related to Rainfall and Flood across China with Social Media Data

  • Jiale Qian,
  • Yunyan Du,
  • Fuyuan Liang,
  • Jiawei Yi,
  • Nan Wang,
  • Wenna Tu,
  • Sheng Huang,
  • Tao Pei,
  • Ting Ma

DOI
https://doi.org/10.3390/ijgi13030092
Journal volume & issue
Vol. 13, no. 3
p. 92

Abstract

Understanding the public’s diverse linguistic expressions about rainfall and flood provides a basis for flood disaster studies and enhances linguistic and cultural awareness. However, existing research tends to overlook linguistic complexity, potentially leading to bias. In this study, we introduce a novel algorithm that captures rainfall- and flood-related expressions by considering the relationship between precipitation observations and linguistic expressions. Analyzing 210 million social media microblogs from 2017, we identified 594 keywords, 20 times more than the usual manually created bag-of-words. Using a large language model, we categorized these keywords into rainfall, flood, and other related terms. We analyzed the semantic features of these keywords in terms of popularity, credibility, time delay, and part of speech, finding that rainfall-related terms are the most commonly used, that flood-related keywords often lag behind precipitation, and that part-of-speech distributions differ notably across categories. We also assessed spatial characteristics from keyword-centric and city-centric perspectives, revealing that 49.5% of the keywords have significant spatial correlation with differing median centers, reflecting regional variations. Large and disaster-impacted cities show the richest expression diversity for rainfall- and flood-related terms.

Keywords