Cost-Effective Knowledge Extraction Framework for Low-Resource Environments

Sangha Nam; Eun-Kyung Kim

doi:10.1109/ACCESS.2024.3394906

IEEE Access (Jan 2024)

Affiliations

Sangha Nam: Language AI Laboratory, NCSoft, Seongnam, Republic of Korea
Eun-Kyung Kim: Department of Big Data and AI, Daejeon University, Daejeon, Republic of Korea

DOI: https://doi.org/10.1109/ACCESS.2024.3394906
Journal volume & issue: Vol. 12, pp. 60668–60681

Abstract

Extracting knowledge from text is crucial for enriching everyday knowledge. Constructing a knowledge extraction environment requires comprehensive processes, such as data generation, data processing, and model and framework design. However, these processes require significant effort in low-resource environments where shared data are not published. Currently, there is no environment in which an entire knowledge extraction framework can be designed and step-by-step experiments performed, even with limited resources. Thus, this study proposes a method for building a cost-effective knowledge extraction environment. In particular, we present a low-cost, high-quality method for annotating a corpus for knowledge extraction in settings where shared data are unavailable. The dataset collected using this method improves the performance of the models in the knowledge extraction system. Specifically, co-reference resolution and relation extraction performance improved by 10% and 18.9%, respectively. Additionally, the entire knowledge extraction system was evaluated using sequential multitask learning, and performance improved by 5% as each trained model was introduced.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
