npj Computational Materials (Sep 2024)
MEPO-ML: a robust graph attention network model for rapid generation of partial atomic charges in metal-organic frameworks
Abstract
Abstract Accurate computation of the gas adsorption properties of MOFs is usually bottlenecked by the DFT calculations required to generate partial atomic charges. Therefore, large virtual screenings of MOFs often use the QEq method which is rapid, but of limited accuracy. Recently, machine learning (ML) models have been trained to generate charges in much better agreement with DFT-derived charges compared to the QEq models. Previous ML charge models for MOFs have all used training sets with less than 3000 MOFs obtained from the CoRE MOF database, which has recently been shown to have high structural error rates. In this work, we developed a graph attention network model for predicting DFT-derived charges in MOFs where the model was developed with the ARC-MOF database that contains 279,632 MOFs and over 40 million charges. This model, which we call MEPO-ML, predicts charges with a mean absolute error of 0.025e on our test set of over 27 K MOFs. Other ML models reported in the literature were also trained using the same dataset and descriptors, and MEPO-ML was shown to give the lowest errors. The gas adsorption properties evaluated using MEPO-ML charges are found to be in significantly better agreement with the reference DFT-derived charges compared to the empirical charges, for both polar and non-polar gases. Using only a single CPU core on our benchmark computer, MEPO-ML charges can be generated in less than two seconds on average (including all computations required to apply the model) for MOFs in the test set of 27 K MOFs.