Scientific Reports (Jun 2022)

Correlation between air pollution and prevalence of conjunctivitis in South Korea using analysis of public big data

  • Sanghyu Nam,
  • Mi Young Shin,
  • Jung Yeob Han,
  • Su Young Moon,
  • Jae Yong Kim,
  • Hungwon Tchah,
  • Hun Lee

DOI
https://doi.org/10.1038/s41598-022-13344-5
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 9

Abstract

Read online

Abstract This study investigated how changes in weather factors affect the prevalence of conjunctivitis using public big data in South Korea. A total of 1,428 public big data entries from January 2013 to December 2019 were collected. Disease data and basic climate/air pollutant concentration records were collected from nationally provided big data. Meteorological factors affecting eye diseases were identified using multiple linear regression and machine learning analysis methods such as extreme gradient boosting (XGBoost), decision tree, and random forest. The prediction model with the best performance was XGBoost (1.180), followed by multiple regression (1.195), random forest (1.206), and decision tree (1.544) when using root mean square error (RMSE) values. With the XGBoost model, province was the most important variable (0.352), followed by month (0.289) and carbon monoxide exposure (0.133). Other air pollutants including sulfur dioxide, PM10, nitrogen dioxides, and ozone showed low associations with conjunctivitis. We identified factors associated with conjunctivitis using traditional multiple regression analysis and machine learning techniques. Regional factors were important for the prevalence of conjunctivitis as well as the atmosphere and air quality factors.