Information (Jun 2015)

ODQ: A Fluid Office Document Query Language

  • Xuhong Liu,
  • Ning Li,
  • Yunmei Shi,
  • Xia Hou

DOI
https://doi.org/10.3390/info6020275
Journal volume & issue
Vol. 6, no. 2
pp. 275 – 286

Abstract

Fluid office documents, semi-structured data often represented in Extensible Markup Language (XML), are an important part of Big Data. These documents come in different formats, and their corresponding Application Programming Interfaces (APIs) depend on the development platform and version, which makes custom development and information retrieval difficult. To solve this problem, we have been developing an office document query (ODQ) language that provides a uniform method for retrieving content from documents of different formats and versions. ODQ builds a common document model ontology to conceal the format details of documents and provides a uniform operation interface for handling office documents in different formats. The results show that ODQ has advantages in format independence and can help users develop document processing systems with good interoperability.
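
The idea of a common document model behind a uniform query interface can be illustrated with a minimal Python sketch. This is not ODQ syntax or its implementation: the two XML layouts ("format A" and "format B"), the `Paragraph` class, and the `query_paragraphs` function are all hypothetical names invented for illustration. The sketch assumes only that each format has its own loader mapping documents into one shared model, so that the query itself never touches format details.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Paragraph:
    """Format-neutral paragraph in a hypothetical common document model."""
    text: str


def load_format_a(xml_text: str) -> list[Paragraph]:
    # Hypothetical format A splits a paragraph into text runs:
    # <doc><p><t>...</t><t>...</t></p></doc>
    root = ET.fromstring(xml_text)
    return [
        Paragraph("".join(t.text or "" for t in p.iter("t")))
        for p in root.iter("p")
    ]


def load_format_b(xml_text: str) -> list[Paragraph]:
    # Hypothetical format B stores paragraph text directly:
    # <document><para>...</para></document>
    root = ET.fromstring(xml_text)
    return [Paragraph(para.text or "") for para in root.iter("para")]


# One loader per format; the query below sees only the common model.
LOADERS = {"a": load_format_a, "b": load_format_b}


def query_paragraphs(xml_text: str, fmt: str, contains: str) -> list[str]:
    """Return the text of every paragraph containing the given substring."""
    paragraphs = LOADERS[fmt](xml_text)
    return [p.text for p in paragraphs if contains in p.text]


DOC_A = "<doc><p><t>Big </t><t>Data</t></p><p><t>Other</t></p></doc>"
DOC_B = "<document><para>Big Data</para><para>Other</para></document>"

if __name__ == "__main__":
    print(query_paragraphs(DOC_A, "a", "Data"))
    print(query_paragraphs(DOC_B, "b", "Data"))
```

Under these assumptions, the same call returns the same result for both documents, and supporting another format would mean adding a loader rather than changing the query.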

Keywords