Chinese Journal of Engineering (工程科学学报) (Jan 2024)
Contract text markup language: A regularization method for extracting legal elements towards smart contracts
Abstract
Smart contracts are gaining importance at the legal level. However, legal contracts are written in natural language, which computers cannot process directly, so accurately understanding contract content and representing its meaning remains challenging. As a result, the generation of smart contract programs lacks regularization, and the resulting programs lack legal recognition and effectiveness. A new approach is therefore needed to transform real-life legal contracts into smart contract programs while keeping both the extraction of legal elements and the program conversion regular. In this paper, we propose a contract text markup language (CTML), a normative, computer-processable language for expressing the meaning of legal contracts. By annotating the syntax, structure, and vocabulary of a contract with CTML, we establish a method for regulating the content and meaning representation of legal contract text, which allows contract elements to be extracted and converted. First, we establish a contract metamodel for CTML, which comprises a three-layer "element-property-component" semantic structure and a metadata markup representation. The contract text information is then refined step by step, from "large to small" and from "coarse to fine", to build the correspondence between real-life contracts and smart legal contracts. The syntax of CTML is designed so that legal elements can be extracted and regularized to form a CTML-annotated contract. Second, we design specific conversion rules from CTML to a smart legal contract language; these rules generate smart legal contract programs by recursing over abstract syntax trees and establishing a mapping relationship.
These rules help users write contracts, improve the efficiency of converting contract text into executable code, and ensure that smart legal contracts are written on solid grounds, thereby improving the conversion chain from legal contracts to executable smart contracts. In addition, taking a factoring contract as an example, we illustrate the details of semantic extraction and code generation. With CTML, contract semantic extraction is clearer, the conversion is more normative, and code development is more effective. Thus, the proposed CTML provides an alternative regularization method for generating smart legal contracts.
Keywords