Journal of Cheminformatics (May 2019)

Capturing mixture composition: an open machine-readable format for representing mixed substances

  • Alex M. Clark,
  • Leah R. McEwen,
  • Peter Gedeck,
  • Barry A. Bunin

DOI
https://doi.org/10.1186/s13321-019-0357-4
Journal volume & issue
Vol. 11, no. 1
pp. 1–17

Abstract

We describe a file format designed to represent mixtures of compounds in a fully machine-readable way. This Mixfile format is intended to fill the same role for substances composed of multiple components as the venerable Molfile does for individual structures. This much-needed data structure is intended to replace current practices for communicating information about mixtures, which usually rely on human-readable text descriptions, drawings of several species within a single molecular diagram, or mutually incompatible ad hoc solutions. We describe an open source software application for editing mixture files, which can also be used as a web-ready tool for manipulating the file format. We also present a corpus of mixture examples, which we have extracted from collections of text-based descriptions. Furthermore, we present an early look at the proposed IUPAC Mixtures InChI specification, instances of which can be generated automatically using the Mixfile format as a precursor.

Keywords