Jisuanji Kexue yu Tansuo (Journal of Frontiers of Computer Science and Technology), Jul 2024

Submodular Optimization Approach for Entity Summarization in Knowledge Graph Driven by Large Language Models

  • ZHANG Qi, ZHONG Hao

DOI
https://doi.org/10.3778/j.issn.1673-9418.2305086
Journal volume & issue
Vol. 18, no. 7
pp. 1806 – 1813

Abstract


The continuous expansion of knowledge graphs has made entity summarization a research hotspot. Entity summarization aims to produce a brief description of an entity from the large set of triple-structured facts that describe it. This research proposes a submodular optimization method for entity summarization based on a large language model. Firstly, using the descriptive information of the entities, relations, and properties in the triples, a large language model embeds them into vectors, effectively capturing the semantics of the triples and producing embedding vectors rich in semantic information. Secondly, based on these embeddings, a relevance measure is defined between any two triples that describe the same entity; the higher the relevance between two triples, the more similar the information they convey. Finally, building on this relevance measure, a normalized, monotonically non-decreasing submodular function is defined, casting entity summarization as a submodular function maximization problem, so that greedy algorithms with provable performance guarantees can be applied directly to extract entity summaries. Experiments are conducted on three public benchmark datasets, and the quality of the extracted summaries is evaluated with two metrics, F1 score and NDCG (normalized discounted cumulative gain). Experimental results show that the proposed approach significantly outperforms the state-of-the-art method.
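The abstract does not give the exact objective, but the pipeline it describes (pairwise triple relevance from embeddings, a normalized monotone submodular function, greedy selection) can be sketched with one plausible instantiation: a facility-location objective over cosine-similarity relevance, which is normalized, monotone non-decreasing, and submodular. The random embeddings below are stand-ins for the LLM-generated triple embeddings; the function names and the choice of facility location are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def relevance(u, v):
    # Relevance between two triple embeddings: cosine similarity
    # rescaled to [0, 1] so the objective stays non-negative.
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return (cos + 1.0) / 2.0

def coverage(summary, embeddings):
    # Facility-location objective: each triple is "covered" by the most
    # relevant triple in the summary. Normalized (empty set -> 0),
    # monotone non-decreasing, and submodular.
    if not summary:
        return 0.0
    return sum(
        max(relevance(embeddings[t], embeddings[s]) for s in summary)
        for t in range(len(embeddings))
    )

def greedy_summary(embeddings, k):
    # Standard greedy maximization under a cardinality constraint;
    # for monotone submodular objectives it carries the (1 - 1/e) guarantee.
    chosen, remaining = [], set(range(len(embeddings)))
    while len(chosen) < k and remaining:
        best = max(remaining, key=lambda t: coverage(chosen + [t], embeddings))
        chosen.append(best)
        remaining.remove(best)
    return chosen

rng = np.random.default_rng(0)
emb = rng.normal(size=(12, 8))  # stand-in for LLM triple embeddings
print(greedy_summary(emb, k=3))
```

In this sketch the greedy loop recomputes the full objective per candidate for clarity; a practical implementation would cache each triple's current best relevance and update it incrementally.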

Keywords