The Astrophysical Journal (Jan 2024)
Data-driven Stellar Intrinsic Colors and Dust Reddenings for Spectrophotometric Data: From the Blue-edge Method to a Machine Learning Approach
Abstract
Intrinsic colors (ICs) of stars are essential for studies of both stellar physics and dust reddening. In this work, we developed an XGBoost model to predict ICs from the atmospheric parameters $T_{\rm eff}$, $\log g$, and [M/H]. The model was trained and tested for three colors in Gaia and Two Micron All Sky Survey bands with 1,040,446 low-reddening sources. The atmospheric parameters were determined by the Gaia DR3 General Stellar Parameterizer from Photometry (GSP-phot) module and were validated by comparison with those from the Apache Point Observatory Galactic Evolution Experiment and the Large Sky Area Multi-Object fiber Spectroscopic Telescope. We further confirmed that the biases in the GSP-phot parameters, especially in [M/H], do not significantly affect the IC prediction. The generalization error of the model estimated from the test set is 0.014 mag for $(G_{\rm BP} - G_{\rm RP})_0$, 0.050 mag for $(G_{\rm BP} - K_{\rm S})_0$, and 0.040 mag for $(J - K_{\rm S})_0$. The model was applied to a sample of 5,714,528 reddened stars with stellar parameters from R. Andrae et al. to calculate ICs and reddenings. The close agreement between our $E(J - K_{\rm S})$ values and literature values further validates the accuracy of the XGBoost model. The ratio $E(G_{\rm BP} - K_{\rm S})/E(G_{\rm BP} - G_{\rm RP})$, a representation of the extinction law, is found to vary with Galactic longitude on large scales. This work provides a preliminary demonstration of the feasibility and accuracy of the machine learning approach for calculating ICs and dust reddening, whose products could be widely applied to spectrophotometric data. The data sets and trained model can be accessed via Zenodo, doi:10.5281/zenodo.12787594. Models for more bands will be completed in future work.
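The reddening calculation summarized above amounts to subtracting a predicted intrinsic color from the observed color, E(X − Y) = (X − Y)_obs − (X − Y)_0. The following is a minimal Python sketch of that arithmetic. The function names `color_excess` and `excess_ratio` are hypothetical and are not taken from the released model. The through-origin least-squares slope is only one simple way to summarize a color-excess ratio and is not necessarily the estimator used in this work.

```python
from typing import List, Sequence


def color_excess(observed: Sequence[float], intrinsic: Sequence[float]) -> List[float]:
    """Return E(X - Y) = (X - Y)_obs - (X - Y)_0 for each star.

    `intrinsic` stands in for the model-predicted intrinsic colors; here
    it is simply a sequence supplied by the caller (an assumption).
    """
    if len(observed) != len(intrinsic):
        raise ValueError("observed and intrinsic must have the same length")
    return [obs - ic for obs, ic in zip(observed, intrinsic)]


def excess_ratio(e_num: Sequence[float], e_den: Sequence[float]) -> float:
    """Least-squares slope through the origin of e_num against e_den.

    Illustrative estimator for a ratio such as
    E(G_BP - K_S) / E(G_BP - G_RP); assumed, not the paper's method.
    """
    if len(e_num) != len(e_den):
        raise ValueError("e_num and e_den must have the same length")
    denominator = sum(d * d for d in e_den)
    if denominator == 0.0:
        raise ValueError("all denominator color excesses are zero")
    return sum(n * d for n, d in zip(e_num, e_den)) / denominator
```

With made-up inputs, `color_excess([1.2, 0.9], [1.0, 0.8])` gives excesses of roughly 0.2 and 0.1 mag, and `excess_ratio` applied to two excess lists returns a single slope for the whole sample. In practice the ratio would be evaluated separately in bins of Galactic longitude to look for the large-scale variation described above.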
Keywords