Research Ideas and Outcomes (Aug 2023)
From implementation to application: FAIR digital objects for training data composition
Abstract
Composing training data for machine learning applications is laborious and time-consuming when done manually. FAIR Digital Objects, in which the data is machine-interpretable and machine-actionable, make it possible to automate and simplify this task. As an application case, we represented labeled Scanning Electron Microscopy images from different sources as FAIR Digital Objects to compose a training data set. Our implementation includes several existing services (the Typed-PID Maker, the Handle Registry, and the ePIC Data Type Registry); in addition, we developed a Python client to automate the relabeling task. Our work provides a proof-of-concept validation of the usefulness of FAIR Digital Objects for a specific task, facilitating further development and future extension to other machine learning applications.
Keywords