IEEE Access (Jan 2019)

From Vision to Content: Construction of Domain-Specific Multi-Modal Knowledge Graph

  • Xiaoming Zhang
  • Xiaoling Sun
  • Chunjie Xie
  • Bing Lun

DOI
https://doi.org/10.1109/ACCESS.2019.2933370
Journal volume & issue
Vol. 7
pp. 108278 – 108294

Abstract

Knowledge graphs are usually constructed to describe the various concepts that exist in the real world as well as the relationships between them. Many knowledge graphs have been built for specific fields, but they typically focus on text or structured data, ignoring the visual information carried by images, and therefore cannot play an adequate role in emerging visualization applications. To address this issue, we design a method that integrates image visual information and text information derived from Wikimedia Commons to construct a domain-specific multi-modal knowledge graph, taking the metallic materials domain as an example. The text description of each image is treated as its context semantics, from which the image's context semantic labels are acquired based on DBpedia resources. Furthermore, we adopt a deep neural network model, rather than simple visual descriptors, to acquire the image's visual semantic labels using concepts from WordNet. To fuse the visual semantic labels and context semantic labels, a path-based concept extension and fusion strategy is proposed based on the conceptual hierarchies of WordNet and DBpedia; it obtains effective extension concepts as well as the links between them, increasing the scale of the knowledge graph and enhancing the correlation between images. The experimental results show that the maximum extension level has a significant impact on the quality of the generated domain knowledge graph, and the best extension level is determined separately for DBpedia and WordNet. In addition, the results are compared with IMGpedia to further demonstrate the effectiveness of the proposed method.
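
As a rough illustration of the path-based concept extension and fusion step described in the abstract, the following is a minimal Python sketch using NLTK's WordNet interface. It assumes labels are plain WordNet nouns and collapses the paper's separate DBpedia-side and WordNet-side extensions into a single routine; the function names (extend_concepts, fuse), the MAX_LEVEL constant, and the example labels are hypothetical, not the paper's implementation.

# Requires: pip install nltk, then nltk.download('wordnet')
from nltk.corpus import wordnet as wn

MAX_LEVEL = 3  # assumed maximum extension level; the paper tunes this value

def extend_concepts(label, max_level=MAX_LEVEL):
    """Walk up the WordNet hypernym hierarchy from a label, collecting
    extension concepts and the is-a links between them, up to max_level."""
    nodes, links = set(), set()
    for synset in wn.synsets(label, pos=wn.NOUN):
        frontier = [(synset, 0)]
        while frontier:
            node, level = frontier.pop()
            nodes.add(node.name())
            if level >= max_level:
                continue
            for hyper in node.hypernyms():
                links.add((node.name(), hyper.name()))
                frontier.append((hyper, level + 1))
    return nodes, links

def fuse(visual_labels, context_labels, max_level=MAX_LEVEL):
    """Fuse visual and context semantic labels by merging their extension
    graphs; shared extension concepts become the nodes that link images
    describing related concepts."""
    nodes, links = set(), set()
    for label in list(visual_labels) + list(context_labels):
        n, l = extend_concepts(label, max_level)
        nodes |= n
        links |= l
    return nodes, links

# Example: fuse one visual label and one context label for a single image.
nodes, links = fuse({"steel"}, {"alloy"})
print(sorted(nodes))

With MAX_LEVEL = 3, both "steel" and "alloy" extend upward to shared hypernyms such as mixture or substance, which is the mechanism by which the extension step increases graph scale and correlates otherwise disconnected images.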

Keywords