Estimation of missing data of monthly rainfall in southwestern Colombia using artificial neural networks

Teresita Canchala-Nastar; Yesid Carvajal-Escobar; Wilfredo Alfonso-Morales; Wilmar Loaiza Cerón; Eduardo Caicedo

Data in Brief (Oct 2019)

Affiliations

Teresita Canchala-Nastar: Grupo de Investigación en Ingeniería de Recursos Hídricos y Suelos (IREHISA), Escuela de Recursos Naturales y del Ambiente (EIDENAR), Facultad de Ingeniería, Universidad del Valle, Calle 13 # 100-00, Cali, Colombia; Corresponding author.
Yesid Carvajal-Escobar: Grupo de Investigación en Ingeniería de Recursos Hídricos y Suelos (IREHISA), Escuela de Recursos Naturales y del Ambiente (EIDENAR), Facultad de Ingeniería, Universidad del Valle, Calle 13 # 100-00, Cali, Colombia
Wilfredo Alfonso-Morales: Grupo de Percepción y Sistemas Inteligentes (PSI), Escuela de Ingeniería Eléctrica y Electrónica, Facultad de Ingeniería, Universidad del Valle, Calle 13 # 100-00, Cali, Colombia
Wilmar Loaiza Cerón: Departamento de Geografía, Facultad de Humanidades, Universidad del Valle, Calle 13 # 100-00, Cali, Colombia
Eduardo Caicedo: Grupo de Percepción y Sistemas Inteligentes (PSI), Escuela de Ingeniería Eléctrica y Electrónica, Facultad de Ingeniería, Universidad del Valle, Calle 13 # 100-00, Cali, Colombia

Journal volume & issue: Vol. 26

Abstract

The success of many water resource management and planning projects depends largely on the quality of the climatic and hydrological data available. Nevertheless, hydroclimatic series frequently contain missing data because of measuring instrument failures, observation recording errors, meteorological extremes, and the difficulty of accessing measurement sites. Hence, missing data must be filled appropriately before any analysis. This paper presents the filling of missing monthly rainfall data for 45 gauge stations located in southwestern Colombia. The series analyzed cover 34 years of observations between 1983 and 2016, available from the Instituto de Hidrología, Meteorología y Estudios Ambientales (IDEAM). The missing data were estimated using Non-linear Principal Component Analysis (NLPCA), a non-linear generalization of standard Principal Component Analysis implemented with an Artificial Neural Network (ANN) approach. The best result was obtained using a network with a [45−44−45] architecture. The estimated mean squared error in the imputation of missing data was approximately 9.8 mm·month⁻¹, showing that the NLPCA approach constitutes a powerful methodology for imputing missing rainfall data. The estimated rainfall dataset helps reduce uncertainty for further studies related to homogeneity analyses, conglomerates, trends, multivariate statistics and meteorological forecasts in regions with information deficits such as southwestern Colombia.

Keywords: Missing data, Monthly Rainfall Data, Artificial neural networks, NLPCA
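The abstract describes the imputation as a neural network with a [45−44−45] architecture, that is, 45 station inputs, 44 hidden units and 45 station outputs. The following is a minimal sketch of that general idea, not the authors' implementation: a one-hidden-layer autoassociative network trained by plain gradient descent on the observed entries only, with the gaps re-estimated from the network output at each pass. The function name `impute_autoencoder`, the tanh activation, the learning rate, the number of epochs and the synthetic rainfall matrix are all assumptions made for illustration; only the 45-station, 34-year (408-month) shape and the layer sizes come from the abstract.

```python
import numpy as np


def impute_autoencoder(X, hidden=44, epochs=300, lr=0.1, seed=0):
    """Fill NaNs in X (months x stations) with a d-hidden-d autoassociative net.

    Illustrative sketch only; hyperparameters are assumptions, not the paper's.
    """
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(X)                      # True where a value was recorded
    mu = np.nanmean(X, axis=0)                   # per-station mean, ignoring gaps
    sd = np.nanstd(X, axis=0)
    sd[sd == 0] = 1.0                            # guard against constant series
    Z = np.where(observed, (X - mu) / sd, 0.0)   # standardize; gaps start at the mean
    n_obs = observed.sum()
    d = X.shape[1]

    W1 = rng.normal(0.0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d)); b2 = np.zeros(d)

    losses = []
    for _ in range(epochs):
        H = np.tanh(Z @ W1 + b1)                 # hidden layer (44 units by default)
        Y = H @ W2 + b2                          # reconstruction of all stations
        err = np.where(observed, Y - Z, 0.0)     # error counted on observed cells only
        losses.append(float((err ** 2).sum() / n_obs))

        # Backpropagate the masked mean squared error.
        gY = 2.0 * err / n_obs
        gH = (gY @ W2.T) * (1.0 - H ** 2)
        W2 -= lr * (H.T @ gY); b2 -= lr * gY.sum(axis=0)
        W1 -= lr * (Z.T @ gH); b1 -= lr * gH.sum(axis=0)

        Z = np.where(observed, Z, Y)             # refresh the gap estimates

    # Back to mm/month; rainfall cannot be negative, so clip the estimates at 0.
    filled = np.where(observed, X, np.clip(Z * sd + mu, 0.0, None))
    return filled, losses


# Synthetic stand-in for the data: 408 months (34 years) x 45 stations.
rng = np.random.default_rng(1)
months = np.arange(408)
season = 100.0 + 60.0 * np.sin(2.0 * np.pi * months / 12.0)
gain = rng.uniform(0.5, 1.5, 45)                 # hypothetical per-station scaling
X_true = np.clip(season[:, None] * gain + rng.normal(0.0, 15.0, (408, 45)), 0.0, None)

X = X_true.copy()
X[rng.random(X.shape) < 0.10] = np.nan           # knock out about 10% of the cells
gaps = np.isnan(X)

X_filled, losses = impute_autoencoder(X)
rmse = np.sqrt(np.mean((X_filled[gaps] - X_true[gaps]) ** 2))
print(f"training loss: {losses[0]:.3f} -> {losses[-1]:.3f}; gap RMSE: {rmse:.1f} mm/month")
```

Because the error is masked, recorded values never change and only the gaps are estimated. The RMSE printed here refers to the synthetic matrix alone and is not comparable with the error reported in the abstract.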

Published in Data in Brief

ISSN: 2352-3409 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Science (General)
Website: http://www.journals.elsevier.com/data-in-brief/
