IEEE Access (Jan 2024)
A Systematic Review on Semantic Role Labeling for Information Extraction in Low-Resource Data
Abstract
Big data poses challenges largely because of unstructured text, which is voluminous, comes from many sources in many formats, and contains considerable noise. This complexity makes it difficult to extract useful information, so the text must first be transformed into structured data for further processing. Information Extraction (IE) does this by extracting relationships, entities, semantic roles, and events from unstructured text and converting them into structured output. One IE task is Semantic Role Labeling (SRL), which identifies the semantic roles in a sentence and thereby enriches the understanding of the text. However, most SRL development focuses on high-resource data, especially English, and the limited development of SRL for low-resource languages and domains remains a complex challenge. This research presents a systematic review of the development of SRL for low-resource data, covering both low-resource languages and domain-specific contexts. The review follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) model; the filtering process yielded 54 quality papers published from 2018 to 2023. We review several essential points: (1) datasets commonly used for SRL tasks and their labeling strategies for low-resource data, (2) methods currently developed for SRL tasks and learning scenarios for dealing with low-resource data, (3) evaluation metrics, and (4) applications of SRL tasks. The review is complemented by a discussion of open issues and potential solutions, to help researchers develop SRL more effectively under the challenges of low-resource data.
Keywords