Data aggregation, ML ready datasets, and an API: leveraging diverse data to create enhanced characterizations of monsoon flood risk

Dharma Hoy; Rey L. Granillo; Leland Boeman; Ben McMahan; Michael A. Crimmins

doi:10.3389/fclim.2023.1107363

Frontiers in Climate (Jul 2023)

Affiliations

Dharma Hoy: Arizona Institute for Resilient Environments and Societies, University of Arizona, Tucson, AZ, United States
Rey L. Granillo: Arizona Institute for Resilient Environments and Societies, University of Arizona, Tucson, AZ, United States
Leland Boeman: Arizona Institute for Resilient Environments and Societies, University of Arizona, Tucson, AZ, United States
Ben McMahan: Arizona Institute for Resilient Environments and Societies, University of Arizona, Tucson, AZ, United States
Michael A. Crimmins: Department of Environmental Science, University of Arizona, Tucson, AZ, United States

DOI: https://doi.org/10.3389/fclim.2023.1107363
Journal volume & issue: Vol. 5

Abstract

Monsoon precipitation and severe flooding are highly variable and often unpredictable, with a range of flood conditions and impacts across a metropolitan region or county. County- and storm-specific watches and warnings issued by the National Weather Service (NWS) alert the public to current flood conditions and risks, but floods are not limited to the area under alert, and these zones can be relatively coarse depending on the data the warnings are based on. Research by the Arizona Institute for Resilient Environments and Societies (AIRES) has produced a data warehouse of time series precipitation totals across the state of Arizona, accessible through an Application Programming Interface (API). The warehouse consists of higher-resolution, geographically dispersed data that helped create improved characterizations of monsoon precipitation variability. There is an opportunity to leverage these data to address flood risk, particularly where advanced computer science methodologies and machine learning (ML) techniques may offer additional spatial and temporal insight into flood events. This can be especially useful during rainfall events, when precipitation station reporting frequencies increase and near real-time totals are accessible via the AIRES API. An ML-ready dataset structured to train ML models facilitates an anticipatory approach to predicting and characterizing flood risk. This offers new inputs for management and decision making, in addition to describing precipitation and flood patterns after an event. In this paper we are the first to make use of the AIRES API, taking the initial step of the machine learning process by assembling the precipitation data into an ML-ready dataset. We then look more closely at the assembled dataset and call attention to characteristics that can be further explored through machine learning. Finally, we summarize future directions for research and climate services using this dataset and API.

Published in Frontiers in Climate

ISSN: 2624-9553 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Geography. Anthropology. Recreation: Environmental sciences
Website: https://www.frontiersin.org/journals/climate#
