The Astrophysical Journal Supplement Series (Jan 2025)
Application of Machine Learning to Background Rejection in Very-high-energy Gamma-Ray Observation
Abstract
Identifying gamma rays and rejecting the cosmic-ray hadron background are crucial for very-high-energy gamma-ray observations and the scientific research that depends on them. Using simulated data from the square kilometer array (KM2A) of LHAASO, eight high-level features were extracted for gamma/hadron classification. Machine learning (ML) models, including logistic regression, support vector machines, decision trees, random forests, XGBoost, CatBoost, and deep neural networks (DNN), were constructed and trained on data sets in four energy bands spanning 10^12 to 10^16 eV, and finally fused using the stacking ensemble algorithm. To comprehensively assess the classification ability of each model, the accuracy, F1 score, precision, recall, and area under the receiver operating characteristic curve (AUC) were used. The results show that the ML methods significantly improve particle classification in LHAASO-KM2A, particularly in the low-energy range. Among these methods, XGBoost, CatBoost, and DNN demonstrate stronger classification capabilities than decision trees and random forests, while the fusion model exhibits the best discriminatory ability. The ML methods thus provide a useful alternative for gamma/hadron identification. The codes used in this paper are available on Zenodo at doi: http://dx.doi.org/10.5281/zenodo.13623261 .
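For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch in plain Python of how they can be computed for a binary gamma/hadron classifier. It assumes labels of 1 for gamma rays and 0 for hadrons and a decision threshold of 0.5. The function names and the toy labels and scores are hypothetical illustrations, not taken from the paper or its Zenodo code.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1, with gamma rays (label 1) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # gammas kept
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # hadrons rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # hadrons mistaken for gammas
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # gammas lost

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random gamma scores above a random hadron.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg) and is
    meant for illustration, not for large samples.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = gamma ray, 0 = hadron; scores are classifier outputs.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = classification_metrics(labels, predictions)
auc = roc_auc(labels, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas the AUC summarizes the ranking quality over all thresholds, which is one reason to report it alongside the others.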
Keywords