BMC Medical Informatics and Decision Making (Jul 2017)
Lightweight predicate extraction for patient-level cancer information and ontology development
Abstract
Background: Knowledge engineering for ontological knowledgebases is resource- and time-intensive. To ease this burden, especially for novices, automated tools from natural language processing can assist in ontology development. We focus on developing ontologies for the public health domain and use patient-centric sources from MedlinePlus related to cancers caused by HPV.
Methods: This paper demonstrates the use of a lightweight open information extraction (OIE) tool to derive accurate knowledge triples that can seed an ontological knowledgebase. We developed a custom application, which interfaces with an information extraction software library, to facilitate producing knowledge triples from textual sources.
Results: The extractions were accurate, with precision ranging from 80% to 89%. These triples can later be transformed into an OWL/RDF representation for our planned ontological knowledgebase.
Conclusions: OIE delivers an effective and accessible method for developing ontologies.
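As a rough illustration of the pipeline summarized above, the following minimal sketch shows how extracted subject-predicate-object triples might be scored for precision and serialized in an RDF-style (N-Triples) form. It is not the authors' application: the `Triple` class, the `BASE` namespace, the helper functions, and the example triples are all hypothetical, and the precision figure simply counts manually judged correct extractions.

```python
from dataclasses import dataclass
from urllib.parse import quote

# Hypothetical namespace; a real ontology would define its own IRI base.
BASE = "http://example.org/hpv#"


@dataclass(frozen=True)
class Triple:
    """One subject-predicate-object extraction from a sentence."""
    subject: str
    predicate: str
    obj: str


def to_uri(phrase: str) -> str:
    """Turn a free-text phrase into an IRI reference under BASE."""
    slug = "_".join(phrase.lower().split())
    return f"<{BASE}{quote(slug)}>"


def to_ntriples(triples) -> str:
    """Serialize triples as one N-Triples statement per line."""
    return "\n".join(
        f"{to_uri(t.subject)} {to_uri(t.predicate)} {to_uri(t.obj)} ."
        for t in triples
    )


def precision(judgments) -> float:
    """Fraction of extractions a reviewer marked as correct."""
    return sum(judgments) / len(judgments) if judgments else 0.0


# Illustrative triples only; these are not taken from the study data.
triples = [
    Triple("HPV", "can cause", "cervical cancer"),
    Triple("the vaccine", "protects against", "HPV infection"),
]

if __name__ == "__main__":
    print(to_ntriples(triples))
    # Hypothetical reviewer judgments for five extractions.
    print(precision([True, True, False, True, True]))
```

Mapping each phrase directly to an IRI keeps the sketch short; a real ontology build would more likely align predicates with curated object properties before producing OWL.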
Keywords