Frontiers in Neuroinformatics (Aug 2011)
A bottom-up approach to data annotation in neurophysiology
Abstract
Metadata providing information about the stimulus, data acquisition, and experimental conditions are indispensable for the analysis and management of experimental data within a lab. However, only rarely are metadata available in a structured, comprehensive, and machine-readable form. This poses a severe problem for finding and retrieving data, both in the laboratory and on the various emerging public databases. Here, we propose a simple format, the Open metaData Markup Language (odML), for collecting and exchanging metadata in an automated, computer-based fashion. In odML, arbitrary metadata information is stored as extended key-value pairs in a hierarchical structure. Central to odML is a clear separation of format and content, i.e., neither keys nor values are defined by the format. This makes odML flexible enough for storing all available metadata instantly without the necessity to submit new keys to an ontology or controlled terminology. Common standard keys can be defined in odML terminologies for guaranteeing interoperability. We started to define such terminologies for neurophysiological data, but aim at a community-driven extension and refinement of the proposed definitions. By customized terminologies that map to these standard terminologies, metadata can be named and organized as required or preferred without softening the standard. Together with the respective libraries provided for common programming languages, the odML format can be integrated into the laboratory workflow, facilitating automated collection of metadata information where it becomes available. The flexibility of odML also encourages a community-driven collection and definition of terms used for annotating data in the neurosciences.
Keywords