npj Computational Materials (Mar 2022)

Distributed representations of atoms and materials for machine learning

  • Luis M. Antunes,
  • Ricardo Grau-Crespo,
  • Keith T. Butler

DOI
https://doi.org/10.1038/s41524-022-00729-3
Journal volume & issue
Vol. 8, no. 1
pp. 1 – 9

Abstract

The use of machine learning is becoming increasingly common in computational materials science. To build effective models of the chemistry of materials, useful machine-based representations of atoms and their compounds are required. We derive distributed representations of compounds from their chemical formulas only, via pooling operations of distributed representations of atoms. These compound representations are evaluated on ten different tasks, such as the prediction of formation energy and band gap, and are found to be competitive with existing benchmarks that make use of structure, and even superior in cases where only composition is available. Finally, we introduce an approach for learning distributed representations of atoms, named SkipAtom, which makes use of the growing information in materials structure databases.
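
The idea of pooling atom vectors into a compound vector can be illustrated with a minimal Python sketch. It is not the authors' implementation: the atom vectors below are made-up toy values rather than trained SkipAtom embeddings, the names `ATOM_VECTORS`, `parse_formula`, and `pool` are hypothetical, and the choice of sum, mean, and max as pooling operations is an assumption, since the abstract only says "pooling operations". The formula parser handles only simple formulas without parentheses.

```python
import re
from collections import Counter

# Toy 3-dimensional atom vectors. Hypothetical values for illustration only;
# they are NOT taken from SkipAtom or any trained model.
ATOM_VECTORS = {
    "Ba": [0.2, -0.1, 0.5],
    "Ti": [0.7, 0.3, -0.2],
    "O": [-0.4, 0.6, 0.1],
}


def parse_formula(formula):
    """Count atoms in a simple formula such as 'BaTiO3' (no parentheses)."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return dict(counts)


def pool(formula, mode="sum"):
    """Pool atom vectors into one compound vector (assumed modes: sum, mean, max)."""
    counts = parse_formula(formula)
    dim = len(next(iter(ATOM_VECTORS.values())))
    if mode == "max":
        # Element-wise maximum over the distinct atom types present.
        return [max(ATOM_VECTORS[s][i] for s in counts) for i in range(dim)]
    # Sum of atom vectors, weighted by how often each atom appears.
    total = [
        sum(n * ATOM_VECTORS[s][i] for s, n in counts.items())
        for i in range(dim)
    ]
    if mode == "sum":
        return total
    if mode == "mean":
        n_atoms = sum(counts.values())
        return [x / n_atoms for x in total]
    raise ValueError(f"unknown pooling mode: {mode}")


if __name__ == "__main__":
    for mode in ("sum", "mean", "max"):
        print(mode, pool("BaTiO3", mode))
```

Whichever pooling operation is used, the resulting vector has a fixed length regardless of how many atoms the formula contains, which is what lets a standard model consume compounds of any composition.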