Information (Feb 2023)
A Tissue-Specific and Toxicology-Focused Knowledge Graph
Abstract
Knowledge graphs (KGs) focused on molecular biology are directed graphs that integrate information from heterogeneous sources of biological and biomedical data, such as ontologies and public databases. They provide a holistic view of biology, chemistry, and disease, allowing users to draw non-obvious connections between concepts through shared associations. Although these massive graphs are constructed using carefully curated ontologies and annotations from public databases, much of the information relating the concepts is context specific. Two important variables that determine the applicability of a given ontology annotation are the species and (especially) the tissue type in which the annotated process takes place. Using a data-driven approach and the results from thousands of high-quality gene expression samples, we have constructed tissue-specific KGs (using liver, kidney, and heart as examples) that empirically validate the annotations provided by ontology curators. The resulting human-centered KGs are designed for toxicology applications but are generalizable to other areas of human biology, addressing the issue of tissue specificity that often limits the applicability of other large KGs. These KGs can serve as valuable tools for generating transparent explanations of experimental results in the form of mechanistic hypotheses that are highly relevant to the studied tissue. Because the data-driven relations are derived from a large collection of human in vitro data, these KGs are particularly well suited for in vitro toxicology applications.
Keywords