Symmetry (Jun 2020)

A Keyword-Based Literature Review Data Generating Algorithm—Analyzing a Field from Scientific Publications

  • Junchao Wang,
  • Guodong Su,
  • Chengrui Wan,
  • Xiwei Huang,
  • Lingling Sun

DOI
https://doi.org/10.3390/sym12060903
Journal volume & issue
Vol. 12, no. 6
p. 903

Abstract

A scientific review is a type of article that summarizes the current state of a specific field, and it is crucial for advancing the scientific community. Authors need to read hundreds of research articles to prepare the data and insights for a comprehensive review, which is time-consuming and labor-intensive. In this work, we present an algorithm that automatically extracts keywords from the meta-information of each article and generates the basic data for review articles. Two different fields, communication engineering and lab-on-a-chip technology, were analyzed as examples. First, we built an article library by downloading all the articles from the target journal using a Python-based crawler. Second, the rapid automatic keyword extraction (RAKE) algorithm was applied to the title and abstract of each article. Finally, we classified all extracted keywords into classes by calculating the Levenshtein distance between each pair of them. The results demonstrated that the algorithm can not only reveal how communication engineering and lab-on-a-chip technology have evolved over the past decades but also summarize the analytical outcomes of data mining on the extracted keywords. Our algorithm is more than a useful tool for researchers preparing a review article; it can also be applied to quantitatively analyze the past and present of a specific research field and to help authors predict its future trends.
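
The final step described in the abstract, grouping keywords by Levenshtein distance, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `levenshtein` and `group_keywords`, the greedy first-match grouping strategy, the length normalization, and the `max_ratio` threshold of 0.25 are all assumptions chosen for illustration. The input is assumed to be a list of keywords that has already been extracted from titles and abstracts.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # Keep the shorter string as b so each row stays as small as possible.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def group_keywords(keywords, max_ratio=0.25):
    """Greedily group keywords into classes (illustrative strategy only).

    A keyword joins the first class whose representative (its first member)
    is within a length-normalized distance of max_ratio; otherwise it starts
    a new class. Both the strategy and the threshold are assumptions.
    """
    classes = []
    for keyword in keywords:
        key = keyword.lower().strip()
        for members in classes:
            representative = members[0]
            longest = max(len(key), len(representative), 1)
            if levenshtein(key, representative) / longest <= max_ratio:
                members.append(key)
                break
        else:
            classes.append([key])
    return classes


if __name__ == "__main__":
    sample = ["microfluidic", "microfluidics", "Microfluidic chip",
              "antenna", "antennas"]
    for members in group_keywords(sample):
        print(members)
```

With this threshold, singular and plural forms such as "antenna" and "antennas" land in the same class, while a longer phrase such as "microfluidic chip" stays separate from "microfluidic". Counting class sizes per publication year would then give the kind of trend data the abstract describes.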

Keywords