Journal of the Text Encoding Initiative (Feb 2012)
Transforming Backward
Abstract
The standard workflow for preparing digital editions for display involves writing XSL to transform handcrafted TEI into either 1) HTML for the web or 2) XSL-FO for conversion into a print-friendly format such as PDF. With either method we implicitly recognize that TEI, even coupled with CSS, is not designed as a presentation technology. Many born-digital documents, however, are encoded in formats that are designed for presentation, such as HTML. Hypothetical future editions of such documents would most likely need to be supplemented by a document description that goes beyond the facilities of HTML to meet the needs of editors. Thus we foresee cases where born-HTML documents could be supplemented and described by TEI in much the same way as TEI currently supplements and describes manuscripts and printed books. In this paper we investigate ways that XHTML documents, both with and without RDFa, can be “transformed backward” into TEI. In addition to the digital edition use case, we also investigate a process for converting HTML content into TEI-based language corpora.
Keywords