Journal of the Text Encoding Initiative (Sep 2016)
Formal Ontologies, Linked Data, and TEI Semantics
Abstract
The debate on the semantic role of markup languages has been quite lively, and the TEI community has played an active part in it. It is commonly acknowledged that markup conveys semantic information. However, XML is a poor language for semantic data modeling. Several proposals have been put forward to provide XML with a formalized and computable semantics. In our opinion, the formalisms offered by the Semantic Web paradigm are mature enough to build a workable semantic extension of the TEI. Our model distinguishes three semantic layers in the TEI: a general and shared intensional semantic layer; a specialized idiolectal layer; and an extensional semantics. Our proposal addresses the first two layers, which we propose to build by adopting a set of formal OWL ontologies. Furnishing the TEI with a semantics based on a formal ontology could have interesting outcomes: facilitating the management of, and research using, document collections in open and multi-standard contexts; aiding interoperability with other relevant standards in the digital cultural heritage context; and providing users with advanced formal tools to define their interpretations of texts semantically and to enable innovative computational processing. To allow semantic interoperability between standards, the TEI ontology has to be aligned with other models; likewise, mapping and merging procedures have to be evaluated. Finally, migrating XML/TEI documents that follow this semantic model into a linked open data dimension requires that we address important issues in order to facilitate data interchange in the cloud. However, the cost and practical complexity of such an extension are considerable, and several theoretical problems, format choices, and implementation details are still to be defined.
Keywords