Science and Technology of Advanced Materials: Methods (Dec 2022)
Data integration for multiple alkali metals in predicting coordination energies based on Bayesian inference
Abstract
Building machine learning models on datasets generated by first-principles calculations is an important approach to exploring next-generation batteries. In previous studies, a dataset for rechargeable secondary batteries was constructed for the Li ion by density functional theory (DFT) calculations and was later extended to five alkali metal ions. This dataset can be regarded as consisting of five alkali metal ion groups, which raises the question of whether it is preferable to build individual models specialized for each alkali metal ion or to build a single model by integrating the datasets. We quantitatively evaluate the feasibility of data integration in the framework of Bayesian model selection and show that integrating the datasets is appropriate. Extracting new knowledge through feature selection is also important in exploring next-generation batteries. To advance such knowledge extraction, the reliability of the selected features should be assessed to avoid misinterpretation. We evaluate the confidence level of feature selection by calculating the posterior probabilities of features using Bayesian model averaging (BMA). We find that the confidence level of feature selection increases as the amount of data increases.
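The two Bayesian ideas named in the abstract can be illustrated with a minimal sketch. It assumes a Bayesian linear regression with a fixed Gaussian prior and known noise variance, for which the marginal likelihood has a closed form, and it runs on synthetic data. It is not the model or dataset used in the paper, and the names `log_evidence`, `inclusion_probabilities`, `alpha`, and `sigma2` are hypothetical. Model selection compares the log evidence of one integrated model with the summed log evidence of per-group models; BMA sums the posterior probabilities of all feature subsets that contain a given feature.

```python
import itertools
import numpy as np


def log_evidence(X, y, alpha=1.0, sigma2=0.09):
    """Exact log marginal likelihood for y = Xw + noise.

    Assumed model (illustrative only): w ~ N(0, I/alpha), noise ~ N(0, sigma2*I),
    so that y ~ N(0, sigma2*I + X X^T / alpha).
    """
    n = len(y)
    C = sigma2 * np.eye(n) + (X @ X.T) / alpha
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(C, y))


def inclusion_probabilities(X, y):
    """BMA over all feature subsets with a uniform prior over subsets."""
    d = X.shape[1]
    subsets = [s for r in range(d + 1)
               for s in itertools.combinations(range(d), r)]
    log_ev = np.array([log_evidence(X[:, np.array(s, dtype=int)], y)
                       for s in subsets])
    post = np.exp(log_ev - log_ev.max())  # subtract max for numerical stability
    post /= post.sum()
    probs = np.zeros(d)
    for p, s in zip(post, subsets):
        for j in s:
            probs[j] += p  # add this subset's posterior to each included feature
    return probs


# Synthetic stand-in for two ion groups that share the same true weights.
rng = np.random.default_rng(0)
w_true = np.array([1.5, 0.0, -2.0])  # feature 1 has no effect
groups = []
for _ in range(2):
    X = rng.normal(size=(30, 3))
    y = X @ w_true + rng.normal(scale=0.3, size=30)
    groups.append((X, y))

X_all = np.vstack([X for X, _ in groups])
y_all = np.concatenate([y for _, y in groups])

# Model selection: one integrated model versus independent per-group models.
log_ev_integrated = log_evidence(X_all, y_all)
log_ev_separate = sum(log_evidence(X, y) for X, y in groups)
print("log evidence (integrated):", log_ev_integrated)
print("log evidence (separate):  ", log_ev_separate)

# BMA: posterior probability that each feature belongs in the model.
probs = inclusion_probabilities(X_all, y_all)
print("posterior inclusion probabilities:", probs)
```

In this toy setting the groups are generated from the same weights, so the integrated model should attain the higher evidence; when groups truly differ, the separate models can win instead. The exhaustive subset enumeration grows as 2^d and is only practical for a small number of features.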
Keywords