Information (Oct 2024)
Construction of Legal Knowledge Graph Based on Knowledge-Enhanced Large Language Models
Abstract
Legal knowledge spans multidimensional, heterogeneous sources such as legal provisions, judicial interpretations, judicial cases, and defenses, and it demands extremely high relevance and accuracy. At the same time, building a legal knowledge reasoning system faces challenges in acquiring, processing, and sharing such multisource heterogeneous knowledge. Knowledge graph technology, a knowledge organization form with the triple as its basic unit, can efficiently transform multisource heterogeneous information into a knowledge representation close to human cognition. Taking the automated construction of the Chinese legal knowledge graph (CLKG) as its case scenario, this paper presents a joint knowledge enhancement model (JKEM) that embeds prior knowledge into a large language model (LLM) and fine-tunes the LLM through prefixes derived from the prior-knowledge data. With most of the LLM's parameters frozen, this fine-tuning scheme adds continuous deep prompts as prefix tokens to the input sequences of different layers, which significantly improves knowledge extraction accuracy. Experiments show that the JKEM achieves a knowledge extraction accuracy of 90.92%. Building on this performance, the CLKG is constructed; it contains 3480 knowledge triples over 9 entity types and 2 relation types, providing strong support for an in-depth understanding of the complex relationships in the legal field.
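The deep-prompt scheme the abstract describes, in which trainable continuous prefix tokens are prepended to the input of every layer while the backbone's parameters stay frozen, can be sketched roughly as follows. This is a minimal toy illustration, not the paper's implementation: the dimensions, the single-matrix stand-in for each transformer layer, and all variable names (`frozen_weights`, `deep_prompts`, `forward`) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

D, LAYERS, PREFIX_LEN = 8, 3, 4  # hypothetical model sizes

# "Frozen" backbone: one weight matrix per layer, standing in for a
# transformer block; these are never updated during fine-tuning.
frozen_weights = [rng.standard_normal((D, D)) * 0.1 for _ in range(LAYERS)]

# Trainable continuous deep prompts: one prefix block per layer.
# In prefix tuning these are the ONLY parameters the optimizer touches.
deep_prompts = [rng.standard_normal((PREFIX_LEN, D)) * 0.1 for _ in range(LAYERS)]

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def forward(tokens: np.ndarray) -> np.ndarray:
    """Run the toy stack; at each layer the real tokens attend over
    [layer-specific prefix ; current hidden states]."""
    h = tokens
    for W, prefix in zip(frozen_weights, deep_prompts):
        full = np.concatenate([prefix, h], axis=0)   # prepend this layer's prefix
        attn = softmax(h @ full.T / np.sqrt(D))      # queries: real tokens only
        h = np.tanh(attn @ full @ W)                 # frozen projection
    return h                                         # same length as the input
```

Because the prefix enters every layer's attention as extra key/value positions, tuning `deep_prompts` steers the frozen backbone's behavior without modifying `frozen_weights`, which is what lets the scheme adapt the LLM to legal knowledge extraction at a small training cost.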
Keywords