IEEE Access (Jan 2021)
Lexical Simplification System to Improve Web Accessibility
Abstract
People with intellectual, language and learning disabilities face accessibility barriers when reading texts that contain complex words. In line with accessibility guidelines, complex words can be identified, and easy synonyms and definitions can be provided for them as reading aids. To support these reading aids, this article presents a lexical simplification system developed for Spanish. The system performs complex word identification (CWI) and offers replacement candidates through substitute generation and selection (SG/SS). Both tasks rely on machine learning techniques and contextual embeddings, drawing on Easy Reading and Plain Language resources such as dictionaries and corpora. Additionally, because of the polysemy present in the language, the system provides definitions for complex words, which are disambiguated by a rule-based method supported by a state-of-the-art embedding resource. The system is integrated into a web system that provides an easy way to improve the readability and comprehension of Spanish texts. The results obtained are satisfactory: in the CWI task, the system obtained better results than other systems that used the same dataset. The SG/SS results are comparable to those of similar work on English and provide a solid starting point for improving this task for Spanish. Finally, the disambiguation process obtained good results when evaluated by a linguistic expert. These findings advance the lexical simplification of generic-domain Spanish texts using easy-to-read resources, among others, and provide systematic support for compliance with accessibility guidelines.
Keywords