Atmosphere (Aug 2024)

Enhanced Particle Classification in Water Cherenkov Detectors Using Machine Learning: Modeling and Validation with Monte Carlo Simulation Datasets

  • Ticiano Jorge Torres Peralta,
  • Maria Graciela Molina,
  • Hernan Asorey,
  • Ivan Sidelnik,
  • Antonio Juan Rubio-Montero,
  • Sergio Dasso,
  • Rafael Mayo-Garcia,
  • Alvaro Taboada,
  • Luis Otiniano,
  • for the LAGO Collaboration

DOI
https://doi.org/10.3390/atmos15091039
Journal volume & issue
Vol. 15, no. 9
p. 1039

Abstract

Read online

The Latin American Giant Observatory (LAGO) is a ground-based extended cosmic rays observatory designed to study transient astrophysical events, the role of the atmosphere on the formation of secondary particles, and space-weather-related phenomena. With the use of a network of Water Cherenkov Detectors (WCDs), LAGO measures the secondary particle flux, a consequence of the interaction of astroparticles impinging on the atmosphere of Earth. This flux can be grouped into three distinct basic constituents: electromagnetic, muonic, and hadronic components. When a particle enters a WCD, it generates a measurable signal characterized by unique features correlating to the particle’s type and the detector’s specific response. The resulting charge histograms from these signals provide valuable insights into the flux of primary astroparticles and their key characteristics. However, these data are insufficient to effectively distinguish between the contributions of different secondary particles. In this work, we extend our previous research by using detailed simulations of the expected atmospheric response to the primary flux and the corresponding response of our WCDs to atmospheric radiation. This dataset, which was created through the combination of the outputs of the ARTI and Meiga simulation frameworks, simulated the expected WCD signals produced by the flux of secondary particles during one day at the LAGO site in Bariloche, Argentina, situated at 865 m above sea level. This was achieved by analyzing the real-time magnetospheric and local atmospheric conditions for February and March of 2012, where the resultant atmospheric secondary-particle flux was integrated into a specific Meiga application featuring a comprehensive Geant4 model of the WCD at this LAGO location. The final output was modified for effective integration into our machine-learning pipeline. With an implementation of Ordering Points to Identify the Clustering Structure (OPTICS), a density-based clustering algorithm used to identify patterns in data collected by a single WCD, we have further refined our approach to implement a method that categorizes particle groups using advanced unsupervised machine learning techniques. This allowed for the differentiation among particle types and utilized the detector’s nuanced response to each, thus pinpointing the principal contributors within each group. Our analysis has demonstrated that applying our enhanced methodology can accurately identify the originating particles with a high degree of confidence on a single-pulse basis, highlighting its precision and reliability. These promising results suggest the feasibility of future implementations of machine-leaning-based models throughout LAGO’s distributed detection network and other astroparticle observatories for semi-automated, onboard and real-time data analysis.

Keywords