Applied Sciences (Oct 2021)

SPUCL (Scientific Publication Classifier): A Human-Readable Labelling System for Scientific Publications

  • Noemi Scarpato,
  • Alessandra Pieroni,
  • Michela Montorsi

DOI
https://doi.org/10.3390/app11199154
Journal volume & issue
Vol. 11, no. 19
p. 9154

Abstract

Critically assessing the scientific literature is a challenging task: defining the state of the art of a research field generally requires analysing and classifying a large number of documents. Document classification systems have addressed this problem with different techniques, such as probabilistic, machine learning, and neural network models. One of the most popular document classification approaches is LDA (Latent Dirichlet Allocation), a probabilistic topic model. A main issue with LDA is that each retrieved topic is a collection of terms with their probabilities, which is not a human-readable form. This paper defines an approach that makes LDA topics comprehensible to humans by exploiting Word2Vec.
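
The general idea of giving an LDA topic a readable label through word embeddings can be illustrated with a minimal sketch. It is not the procedure defined in the paper, whose details the abstract does not give. It assumes one plausible strategy: take the probability-weighted centroid of the topic's term vectors and pick the candidate label whose vector is closest by cosine similarity. The topic terms, probabilities, two-dimensional vectors, candidate labels, and the function names `topic_centroid` and `label_topic` are all hypothetical; in practice the vectors would come from a trained Word2Vec model.

```python
import math

# Hypothetical LDA topic: (term, probability) pairs, as LDA would return them.
topic = [("neural", 0.40), ("network", 0.35), ("training", 0.25)]

# Hypothetical 2-d vectors standing in for trained Word2Vec embeddings.
embeddings = {
    "neural": (0.9, 0.1),
    "network": (0.8, 0.2),
    "training": (0.7, 0.3),
    "deep_learning": (0.85, 0.15),
    "botany": (0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def topic_centroid(topic, embeddings):
    """Probability-weighted mean of the vectors of the topic's known terms."""
    dim = len(next(iter(embeddings.values())))
    centroid = [0.0] * dim
    total = 0.0
    for term, prob in topic:
        vec = embeddings.get(term)
        if vec is None:  # skip terms that have no embedding
            continue
        total += prob
        for i, x in enumerate(vec):
            centroid[i] += prob * x
    return [x / total for x in centroid] if total else centroid


def label_topic(topic, embeddings, candidates):
    """Return the candidate label closest to the topic centroid."""
    centroid = topic_centroid(topic, embeddings)
    return max(candidates, key=lambda c: cosine(centroid, embeddings[c]))


centroid = topic_centroid(topic, embeddings)
label = label_topic(topic, embeddings, ["deep_learning", "botany"])
print(label)
```

With these toy vectors the topic is labelled `deep_learning`, because its vector points in almost the same direction as the weighted centroid of the topic terms.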

Keywords