Biblios (Dec 2022)
Experiment of terminology extraction oriented to the construction of a library science thesaurus
Abstract
Objective. The objective of this article is to evaluate two terminology extraction techniques, manual and automated, and to assess the effectiveness of each in obtaining useful terms for the construction of a library science thesaurus. Method. The methodology was exploratory and quantitative and was based on two terminology extraction experiments: (1) manual extraction and (2) automated extraction. The manual extraction was carried out by a professional with multidisciplinary academic training, while the automated extraction was carried out using the WordStat program. Both processes were based on the same corpus, consisting of 283,585 words from 59 articles on library and information science published in the journal Investigación Bibliotecológica during 2019 and 2020. Results. Manual terminology extraction provided excellent results: 82% of the extracted terms were useful and were established as viable descriptors for the thesaurus. In comparison, automated extraction was a fast process, but only 12% of its terms proved useful and were established as viable descriptors. Conclusions. Each terminology extraction technique was useful, but the two differed. Manual extraction required a high investment of human resources and time, and its results were highly effective. In contrast, automated extraction required less human effort and less time, but its results in this experiment were less accurate and less useful. It is concluded that experimenting with various terminology extraction techniques is important, because the terminology base is the cornerstone of any controlled vocabulary.
Keywords