Atmospheric Chemistry and Physics (Jun 2024)

Unveiling the optimal regression model for source apportionment of the oxidative potential of PM<sub>10</sub>

  • V. D. Ngoc Thuy,
  • J.-L. Jaffrezo,
  • I. Hough,
  • P. A. Dominutti,
  • G. Salque Moreton,
  • G. Gille,
  • F. Francony,
  • A. Patron-Anquez,
  • O. Favez,
  • G. Uzu

DOI
https://doi.org/10.5194/acp-24-7261-2024
Journal volume & issue
Vol. 24
pp. 7261 – 7282

Abstract

The capacity of particulate matter (PM) to generate reactive oxygen species (ROS) in vivo leading to oxidative stress is thought to be a main pathway in the health effects of PM inhalation. Exogenous ROS from PM can be assessed by acellular oxidative potential (OP) measurements as a proxy of the induction of oxidative stress in the lungs. Here, we investigate the importance of OP apportionment methods for OP distribution by PM<sub>10</sub> sources in different types of environments. PM<sub>10</sub> sources derived from receptor models (e.g., EPA positive matrix factorization (EPA PMF)) are coupled with regression models expressing the associations between PM<sub>10</sub> sources and PM<sub>10</sub> OP measured by ascorbic acid (OP<sub>AA</sub>) and dithiothreitol assay (OP<sub>DTT</sub>). These relationships are compared for eight regression techniques: ordinary least squares, weighted least squares, positive least squares, Ridge, Lasso, generalized linear model, random forest, and multilayer perceptron. The models are evaluated on 1 year of PM<sub>10</sub> samples and chemical analyses at each of six sites of different typologies in France to assess the possible impact of PM source variability on PM<sub>10</sub> OP apportionment. PM<sub>10</sub> source-specific OP<sub>DTT</sub> and OP<sub>AA</sub> and out-of-sample apportionment accuracy vary substantially by model, highlighting the importance of model selection according to the datasets. Recommendations for the selection of the most accurate model are provided, encompassing considerations such as multicollinearity and homoscedasticity.
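
As a rough illustration of the approach described in the abstract, the sketch below regresses a measured OP on source contributions with three of the eight techniques named (ordinary least squares, weighted least squares, Ridge) and compares out-of-sample error. It is a minimal sketch and not the authors' implementation. The data are synthetic, and the source names, intrinsic OP values, noise model, train/test split, and Ridge penalty are all assumptions made for the example.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns (intercept, coefs)."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[0], beta[1:]

def fit_wls(X, y, sigma):
    """Weighted least squares: rows scaled by 1/sigma (weights 1/sigma**2)."""
    A = np.column_stack([np.ones(len(X)), X])
    w = 1.0 / sigma
    beta, *_ = np.linalg.lstsq(A * w[:, None], y * w, rcond=None)
    return beta[0], beta[1:]

def fit_ridge(X, y, alpha):
    """Ridge regression: L2 penalty on the slopes, intercept unpenalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coefs = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return y_mean - x_mean @ coefs, coefs

def apportion(X, coefs):
    """OP attributed to each source per sample: contribution x intrinsic OP."""
    return X * coefs

# Synthetic stand-in for PMF output: all values below are invented.
rng = np.random.default_rng(0)
n_samples = 120
sources = ["traffic", "biomass_burning", "dust"]      # hypothetical names
X = rng.gamma(shape=2.0, scale=2.0, size=(n_samples, len(sources)))
true_intrinsic = np.array([0.20, 0.10, 0.02])         # assumed OP per unit mass
sigma = 0.05 + 0.05 * X.sum(axis=1)                   # heteroscedastic noise
y = X @ true_intrinsic + rng.normal(0.0, sigma)

# Simple hold-out split to estimate out-of-sample accuracy.
train, test = np.arange(0, 90), np.arange(90, n_samples)
fits = {
    "OLS": fit_ols(X[train], y[train]),
    "WLS": fit_wls(X[train], y[train], sigma[train]),
    "Ridge": fit_ridge(X[train], y[train], alpha=1.0),
}

results = {}
for name, (intercept, coefs) in fits.items():
    pred = intercept + X[test] @ coefs
    results[name] = float(np.sqrt(np.mean((pred - y[test]) ** 2)))
    print(name, "intrinsic OP:", np.round(coefs, 3), "RMSE:", round(results[name], 3))
```

The per-sample source contributions to OP then follow from `apportion(X, coefs)`, whose columns sum (with the intercept) to the predicted OP. Positive least squares, which the abstract also lists, could be swapped in with a non-negative solver such as `scipy.optimize.nnls`.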