Journal of Cheminformatics (Apr 2023)
Principles and requirements for nanomaterial representations to facilitate machine processing and cooperation with nanoinformatics tools
Abstract
Efficient, machine-readable representations are needed to accurately identify, validate and communicate information about chemical structures. Many such representations have been developed, for example the Simplified Molecular-Input Line-Entry System (SMILES) and the IUPAC International Chemical Identifier (InChI), each offering advantages specific to particular use cases. The multi-component structures of nanomaterials (NMs), however, remain out of scope for all currently available standards, because the nature of NMs poses new challenges for formalizing the encoding of their structure, interactions and environmental parameters. In this work we identify a set of principles that a NM representation should adhere to in order to provide “machine-friendly” encodings of NMs, i.e. encodings that facilitate machine processing and cooperation with nanoinformatics tools. We illustrate our principles by showing how the recently introduced InChI-based NM representation might, in principle, be augmented to encode morphology, mixture properties and distributions of properties, as well as to capture auxiliary information and allow data reuse.
Keywords