Weather and Climate Extremes (Sep 2023)

Automatized spatio-temporal detection of drought impacts from newspaper articles using natural language processing and machine learning

  • Jan Sodoge,
  • Christian Kuhlicke,
  • Mariana Madruga de Brito

Journal volume & issue
Vol. 41
p. 100574

Abstract

Droughts are expected to increase both in terms of frequency and magnitude across Europe. Despite the multitude of adverse effects these disasters impose on social-ecological systems, most impact assessments are constrained to single event and/or single sector analyses. Furthermore, existing longitudinal multi-sectoral datasets are limited in spatiotemporal homogeneity and scope, resulting in fragmented datasets. To address this gap, we propose a novel method for the automatized detection of drought impacts based on newspaper articles. We employ natural language processing (NLP) and machine learning to identify different socio-economic impacts (e.g. agriculture, forestry, livestock, fires) and their geographic and temporal scope from 40,000 newspaper articles reporting about droughts in Germany between 2000 and 2021. Our method is able to track impacts over long time periods, allowing us to assess how drought impacts evolve. Accuracy levels of 92–96% per impact class were obtained for the automatic classification of the impacts when evaluated on a human-annotated dataset. Furthermore, our resulting impact dataset can replicate both temporal and spatial trends when validated against independent impact and hazard data. Overall, the proposed approach advances current research as it (1) requires a significantly lower workload than conventional impact assessment methods, (2) allows addressing large text datasets, (3) reduces subjectivity and human bias, (4) is generalizable to other hazard types as well as text corpora, and (5) achieves sufficient levels of accuracy. The findings highlight the applicability of NLP and machine learning to create comprehensive longitudinal impact datasets.
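
As a rough illustration of the pipeline described above (detect impact classes in each article, evaluate them against human annotations, and aggregate them over time and space), the following minimal sketch may help. It is not the authors' implementation: the paper uses NLP and trained machine-learning classifiers, whereas this sketch swaps in simple keyword matching so that it runs with the Python standard library alone. The keyword lists, the function names, and the article fields `year`, `region`, and `text` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical keyword lists per impact class (illustrative only; the paper
# relies on trained classifiers, not on keyword matching).
IMPACT_KEYWORDS = {
    "agriculture": ("crop", "harvest", "yield"),
    "forestry": ("forest", "bark beetle", "timber"),
    "livestock": ("cattle", "fodder", "livestock"),
    "fires": ("wildfire", "forest fire", "blaze"),
}


def detect_impacts(text):
    """Return the set of impact classes whose keywords occur in the text."""
    lowered = text.lower()
    return {
        cls
        for cls, words in IMPACT_KEYWORDS.items()
        if any(word in lowered for word in words)
    }


def per_class_accuracy(predicted, annotated):
    """Per class, the share of articles where prediction and human label agree."""
    n = len(annotated)
    return {
        cls: sum((cls in p) == (cls in a) for p, a in zip(predicted, annotated)) / n
        for cls in IMPACT_KEYWORDS
    }


def impacts_by_year_and_region(articles):
    """Count detected impacts per (year, region, impact class)."""
    counts = Counter()
    for article in articles:
        for cls in detect_impacts(article["text"]):
            counts[(article["year"], article["region"], cls)] += 1
    return counts
```

Substring matching is deliberately crude: a phrase such as "forest fire" would be tagged as both forestry and fires, which is one reason a trained classifier evaluated on annotated data, as in the paper, is preferable to a keyword list.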

Keywords