Intelligent Systems with Applications (Feb 2023)
Leveraging hybrid machine learning and data fusion for accurate mapping of malaria cases using meteorological variables in western India
Abstract
We propose a hybrid machine learning algorithm (P2CA-PSO-ANN) to model malaria outbreaks in three districts (Barmer, Bikaner, and Jodhpur) of Rajasthan in western India. We use meteorological variables (relative humidity, temperature, and rainfall) as input features to predict malaria cases, and we capture the combined impact of these variables through a linear data fusion. We then extract uncorrelated information from the feature set by applying Probabilistic Principal Component Analysis (P2CA). We train a fully connected feed-forward Artificial Neural Network (ANN), optimising its hyperparameters iteratively with a bio-inspired optimisation algorithm, Particle Swarm Optimisation (PSO). We train and evaluate the algorithm using monthly meteorological variables from 2009 to 2012. The model predicts malaria cases accurately, with a coefficient of correlation R = 0.99 and a Root Mean Square Error (RMSE) of 1.76. Finally, we compare the accuracy of our model with that of several benchmark algorithms: Generalised Regression Neural Network (GRNN), Gaussian Process Regression (GPR), Support Vector Regression (SVR), Random Forest, and Radial Basis Neural Networks (RBNN). In this comparison, the hybrid model achieved relatively high performance. This approach can serve as an intelligent early warning system that predicts malaria outbreaks solely from meteorological data.
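To make the fusion step and the two reported metrics concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not specify the form of the linear data fusion, so the equal-weight sum of standardised variables used here is an assumption, as are the function names `linear_fusion`, `correlation_coefficient`, and `rmse`. The toy numbers are invented for illustration and are not data from the study.

```python
import numpy as np


def linear_fusion(features, weights=None):
    """Standardise each column and append a weighted sum as a fused feature.

    ASSUMPTION: equal weights by default; the abstract only states that the
    fusion is linear, not which weights are used.
    """
    features = np.asarray(features, dtype=float)
    # z-score each meteorological variable so the units are comparable
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    if weights is None:
        weights = np.full(z.shape[1], 1.0 / z.shape[1])
    fused = z @ np.asarray(weights, dtype=float)
    return np.column_stack([z, fused])


def correlation_coefficient(y_true, y_pred):
    """Pearson coefficient of correlation R between observed and predicted."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def rmse(y_true, y_pred):
    """Root Mean Square Error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Hypothetical toy values (NOT from the study): four months of
# relative humidity (%), temperature (deg C), and rainfall (mm).
toy_weather = [[60, 30, 5], [70, 32, 20], [80, 29, 50], [65, 27, 10]]
toy_inputs = linear_fusion(toy_weather)  # three z-scored columns + fused column

# Hypothetical observed and predicted monthly case counts.
observed = [12, 20, 41, 15]
predicted = [13, 19, 40, 17]
print("R =", correlation_coefficient(observed, predicted))
print("RMSE =", rmse(observed, predicted))
```

In a full pipeline along the lines the abstract describes, the fused feature matrix would next pass through P2CA for decorrelation and then into the PSO-tuned ANN; those stages are omitted here because the abstract gives no details of their configuration.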