Infectious Diseases of Poverty (May 2021)

Infestation risk of the intermediate snail host of Schistosoma japonicum in the Yangtze River Basin: improved results by spatial reassessment and a random forest approach

  • Jin-Xin Zheng,
  • Shang Xia,
  • Shan Lv,
  • Yi Zhang,
  • Robert Bergquist,
  • Xiao-Nong Zhou

DOI
https://doi.org/10.1186/s40249-021-00852-1
Journal volume & issue
Vol. 10, no. 1
pp. 1 – 13

Abstract

Background Oncomelania hupensis is the only intermediate snail host of Schistosoma japonicum, and the distribution of O. hupensis is an important indicator for schistosomiasis surveillance. This study explored the feasibility of a random forest algorithm weighted by spatial distance for predicting the risk of schistosomiasis distribution in the Yangtze River Basin in China, with the aim of producing a more precise reference for the national schistosomiasis control programme by reducing the number of snail survey sites without losing predictive accuracy.
Methods Snail presence and absence records were collected from Anhui, Hunan, Hubei, Jiangxi and Jiangsu provinces in 2018. A random forest machine learning algorithm based on a set of environmental and climatic variables was developed to predict the breeding sites of O. hupensis, the intermediate snail host of S. japonicum. Different spatial sizes of a hexagonal grid system were compared to estimate the number of snail sampling sites required. The predictive accuracy in relation to the geographic distances between snail sampling sites was estimated by calculating Kappa and the area under the curve (AUC).
Results The highest accuracy (AUC = 0.889 and Kappa = 0.618) was achieved at the 5 km distance weight. The five factors with the strongest correlation with O. hupensis infestation probability were: (1) distance to lake (48.9%), (2) distance to river (36.6%), (3) isothermality (29.5%), (4) mean daily difference in temperature (28.1%), and (5) altitude (26.0%). The risk map showed that areas characterized by snail infestation were mainly located along the Yangtze River, with the highest probability in the dividing, slow-flowing river arms in the middle and lower reaches of the Yangtze River in Anhui, followed by areas near the shores of China’s two main lakes, the Dongting Lake in Hunan and Hubei and the Poyang Lake in Jiangxi.
Conclusions Applying a random forest machine learning algorithm made it feasible to precisely predict snail infestation probability, an approach that could improve the sensitivity of the Chinese schistosome surveillance system. Redesigning the snail surveillance system by spatial bias correction of O. hupensis infestation in the Yangtze River Basin made it possible to reduce the number of sites requiring investigation from 2369 to 1747.
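
The abstract reports predictive accuracy as Kappa and AUC. For readers unfamiliar with these two measures, the following is a minimal sketch of how they can be computed for presence/absence predictions. It is not the authors' code: the function names `cohen_kappa` and `roc_auc`, the 0.5 classification threshold, and the eight toy survey sites are all assumptions made for illustration, and the numbers bear no relation to the study's data.

```python
def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa for binary labels (1 = snails present, 0 = absent)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed agreement: share of sites where prediction matches the record.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected by chance, from the marginal presence rates.
    p_true = sum(y_true) / n
    p_pred = sum(y_pred) / n
    expected = p_true * p_pred + (1 - p_true) * (1 - p_pred)
    if expected == 1.0:
        return float("nan")  # Kappa is undefined when chance agreement is total
    return (observed - expected) / (1 - expected)


def roc_auc(y_true, scores):
    """AUC as the probability that a presence site outscores an absence site."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence record")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: survey records and predicted infestation probabilities.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
# Assumed threshold of 0.5 to turn probabilities into presence/absence calls.
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]

print("Kappa:", cohen_kappa(toy_true, toy_pred))
print("AUC:", roc_auc(toy_true, toy_scores))
```

Kappa depends on the chosen threshold, whereas AUC is computed from the ranking of the predicted probabilities alone, which is one reason the two are commonly reported together.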

Keywords