Atmospheric Measurement Techniques (Jul 2022)
Automated identification of local contamination in remote atmospheric composition time series
Abstract
Atmospheric observations in remote locations make it possible to explore trace gas and particle concentrations in pristine environments. However, data from remote areas are often contaminated by pollution from local sources. Detecting this contamination is thus a central and frequently encountered issue. Consequently, many different methods exist today to identify local contamination in atmospheric composition measurement time series, but no single method has been widely accepted. In this study, we present a new method to identify primary pollution in remote atmospheric datasets, e.g., from ship campaigns or stations with a low background signal compared to the contaminated signal. The pollution detection algorithm (PDA) identifies and flags periods of polluted data in five steps. The first and most important step identifies polluted periods based on the time derivative of the concentration. If this derivative exceeds a given threshold, data are flagged as polluted. Further pollution identification steps are a simple concentration threshold filter, a neighboring points filter (optional), a median filter, and a sparse data filter (optional). The PDA only relies on the target dataset itself and is independent of ancillary datasets such as meteorological variables. All parameters of each step are adjustable so that the PDA can be “tuned” to be more or less stringent (i.e., to flag more or fewer data points as contaminated). The PDA was developed and tested with a particle number concentration dataset collected during the Multidisciplinary drifting Observatory for the Study of Arctic Climate (MOSAiC) expedition in the central Arctic. Using strict settings, we identified 62 % of the data as influenced by local contamination. Using a second independent particle number concentration dataset also collected during MOSAiC, we evaluated the performance of the PDA against the same dataset cleaned by visual inspection.
The two methods agreed in 94 % of the cases. Additionally, the PDA was successfully applied to a trace gas dataset (CO2), also collected during MOSAiC, and to another particle number concentration dataset, collected at the high-altitude background station Jungfraujoch, Switzerland. Thus, the PDA proves to be a useful and flexible tool to identify periods affected by local contamination in atmospheric composition datasets without the need for ancillary measurements. It is best applied to data representing primary pollution. The user-friendly and open-access code enables reproducible application to a wide suite of different datasets. It is available at https://doi.org/10.5281/zenodo.5761101 (Beck et al., 2021).
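The first three steps described above can be illustrated with a minimal sketch. It is not the published PDA code. The function name `flag_pollution`, its parameter names, the use of a constant derivative threshold, and the rule that a point is flagged when both of its direct neighbors are flagged are all assumptions made here for illustration. The median and sparse data filters are omitted.

```python
import numpy as np

def flag_pollution(conc, dt_seconds, grad_limit, conc_limit, flag_neighbors=True):
    """Return a boolean mask that is True where a point is flagged as polluted.

    Hypothetical sketch: all names and the constant thresholds are
    illustrative assumptions, not the interface of the published PDA.
    """
    conc = np.asarray(conc, dtype=float)

    # Step 1: flag points whose absolute time derivative exceeds the limit.
    # np.gradient uses central differences in the interior of the series.
    grad = np.abs(np.gradient(conc, dt_seconds))
    polluted = grad > grad_limit

    # Step 2: simple concentration threshold filter.
    polluted |= conc > conc_limit

    # Step 3 (optional): flag a point if both direct neighbors are flagged.
    if flag_neighbors:
        both_sides = polluted[:-2] & polluted[2:]
        polluted[1:-1] |= both_sides

    return polluted

# Synthetic one-minute series with a short spike (made-up values).
series = [100, 102, 101, 900, 950, 103, 100, 99]
mask = flag_pollution(series, dt_seconds=60, grad_limit=1.0, conc_limit=500)
print(mask)
```

In this synthetic series the points immediately before and after the spike are flagged as well, because the central difference at those points spans the jump. Raising `grad_limit` or `conc_limit` makes the sketch less stringent, which mirrors the tuning idea described in the abstract.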