Iranian Journal of Information Processing & Management (Sep 2018)

Consistency between Descriptors, Author-Supported Keywords and Tags in the ERIC and Mendeley Database

  • Maryam Ghanavati,
  • Alireza Noruzi,
  • Maryam Nakhoda,
  • Ashkan Khatir

Journal volume & issue
Vol. 33, no. 4
pp. 1715 – 1736

Abstract

This study examined the consistency of language among indexers, authors and taggers in the ERIC and Mendeley databases. Content analysis methods were used to evaluate this consistency and to determine the keywords that the three groups have in common. The sample was drawn from the top twenty journals in the field of Educational Research according to the Journal Citation Reports (JCR) of Web of Science, as indexed in the ERIC database in 2014. In total, 499 articles published in these journals in 2014 were chosen as the dataset. Only articles that had author-assigned keywords, were indexed in the ERIC database and were tagged in the Mendeley database between January 2014 and August 2016 were eligible. For that period, the descriptors assigned to the articles in ERIC and the tags attached to them in Mendeley were extracted, and the author-assigned keywords of all 499 articles were collected. Software based on object-oriented programming (OOP) in C++ was then written to analyze the search results. Descriptive statistics and thesaural term comparison show important differences among the keywords of the three groups: taggers, authors and professional indexers differ in the words they choose as tags, author-assigned keywords and descriptors. The consistency between the author-assigned keywords and the Mendeley user tags of the 499 articles was 15 percent, while the consistency between the ERIC descriptors and the Mendeley user tags was 3 percent.
By comparison, the consistency between the descriptors assigned to the articles in the ERIC database and the author-assigned keywords was 4 percent, and the consistency among all three groups was 1.1 percent. The share of descriptors present in the ERIC thesaurus was 34 percent, which was higher than the corresponding share of author-assigned keywords and tags. The findings showed that authors and taggers were more consistent with each other than indexers were with either authors or taggers. This means that the three sides of the information representation triangle, i.e., indexer, author and tagger, are unfamiliar with each other’s language. It is worth noting that tags are useful supplements to controlled vocabularies, since tags provide a means for the social organization of knowledge outside the framework of a controlled vocabulary. The low consistency between tags and descriptors in this research indicates that Mendeley users do not use the same terminology as the subject specialists who maintain descriptors in the ERIC thesaurus. Further research involving semantic analysis of Mendeley tags may reveal an emerging vocabulary suitable for inclusion in the ERIC thesaurus as a controlled vocabulary.
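
The abstract reports consistency percentages but does not state the formula behind them. As an illustration only, the sketch below assumes Hooper's measure (terms shared by all groups divided by all distinct terms used by any group), which is one common choice in indexing-consistency studies and may not be what the authors used. It is written in Python for brevity and is not the authors' C++ tool. The function names and the sample term lists are hypothetical and are not data from the study.

```python
def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(term.lower().split())


def consistency(*term_lists) -> float:
    """Shared terms divided by all distinct terms across the given lists.

    With two lists this is Hooper's measure; using it for three lists
    is an assumption made for this sketch.
    """
    term_sets = [{normalize(t) for t in terms} for terms in term_lists]
    if not term_sets:
        return 0.0
    union = set.union(*term_sets)
    if not union:
        return 0.0
    shared = set.intersection(*term_sets)
    return len(shared) / len(union)


# Hypothetical terms for a single article (illustrative, not from the study).
descriptors = ["Academic Achievement", "Teaching Methods", "Student Attitudes"]
author_keywords = ["academic achievement", "motivation", "teaching methods"]
tags = ["motivation", "Academic  achievement", "learning"]

results = {
    "descriptors vs author keywords": consistency(descriptors, author_keywords),
    "author keywords vs tags": consistency(author_keywords, tags),
    "descriptors vs tags": consistency(descriptors, tags),
    "all three groups": consistency(descriptors, author_keywords, tags),
}

for label, value in results.items():
    print(f"{label}: {value:.0%}")
```

Averaging such per-article values over a whole sample is one plausible way to arrive at corpus-level figures like those quoted above, though the abstract does not say how the aggregation was done.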

Keywords