IEEE Access (Jan 2019)

Self-Citation Analysis on Google Scholar Dataset for H-Index Corrections

  • Fiaz Majeed,
  • Muhammad Shafiq,
  • Amjad Ali,
  • Muhammad Awais Hassan,
  • Syed Ali Abbas,
  • Mohammad Eid Alzahrani,
  • Muhammad Qaiser Saleem,
  • Hannan Bin Liaqat,
  • Akber Gardezi,
  • Azeem Irshad

DOI
https://doi.org/10.1109/ACCESS.2019.2938657
Journal volume & issue
Vol. 7
pp. 126025 – 126036

Abstract

In recent decades, self-citations have been studied extensively in academia, with the Web of Science (WoS) citation count and h-index treated as the benchmark parameters. WoS determines citations based on the Institute for Scientific Information's (ISI) master list. Google Scholar, by comparison, maintains a broad collection of research articles. However, Google Scholar does not exclude self-citations from the citation list of a particular journal, author, or co-author, so its citation statistics are not regarded as highly accurate. In this paper, we propose an updated h-index for Google Scholar by first quantifying the self-citations and then excluding them from the h-index. We target two aspects of Google Scholar: the quality of its sources and the self-citation records present in its citation lists. Our analysis uses two datasets. The first consists of scientists awarded by Scientometrics, with records collected from Google Scholar. In this dataset, 28 scientists have been recognized as best researchers on the basis of their maximum citations and h-indexes. The second dataset includes 16 non-award-winning scientists. Both datasets cover the period from 1984 to 2017. For the award-winning scientists, the aggregated journal self-citations are observed to be 3.95%, whereas author and co-author self-citations are found to be 2.86% and 3.33%, respectively. In contrast, the non-award-winning scientists have average journal-biased citations of 1.22%, author-biased citations of 0.41%, and co-author-biased citations of 0.90%. We consider three types of self-citation (journal, author, and co-author) for each scientist and exclude them cumulatively to calculate the revised h-index. We obtain a new ranking of the scientists based on this more accurate updated h-index. The updated h-index for Google Scholar can be used for a more accurate academic ranking of authors and research articles.
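
The correction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Paper` and `Citation` records, the function names, and the example data are hypothetical, and the three self-citation tests below are assumed working definitions (the citing paper shares the scientist as author, shares another author of the cited paper, or appears in the same journal), since the abstract does not spell them out.

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    citing_authors: frozenset  # authors of the citing paper
    citing_journal: str        # venue of the citing paper


@dataclass
class Paper:
    authors: frozenset         # authors of the cited paper
    journal: str               # venue of the cited paper
    citations: list = field(default_factory=list)


def h_index(counts):
    """Largest h such that h papers have at least h citations each."""
    h = 0
    for rank, count in enumerate(sorted(counts, reverse=True), start=1):
        if count < rank:
            break
        h = rank
    return h


def is_self_citation(paper, citation, scientist):
    """Assumed definitions of the three self-citation types."""
    author_self = scientist in citation.citing_authors
    coauthor_self = bool((paper.authors - {scientist}) & citation.citing_authors)
    journal_self = citation.citing_journal == paper.journal
    return author_self or coauthor_self or journal_self


def revised_h_index(papers, scientist):
    """h-index after dropping journal, author, and co-author self-citations."""
    counts = [
        sum(1 for c in p.citations if not is_self_citation(p, c, scientist))
        for p in papers
    ]
    return h_index(counts)


def cite(authors, journal):
    """Helper for building the made-up example data below."""
    return Citation(frozenset(authors), journal)


# Made-up example: scientist "A" with three papers, three citations each.
papers = [
    Paper(frozenset({"A", "B"}), "J1",
          [cite({"X"}, "J2"), cite({"A"}, "J3"), cite({"Y"}, "J1")]),
    Paper(frozenset({"A"}), "J2",
          [cite({"X"}, "J1"), cite({"Y"}, "J3"), cite({"A"}, "J5")]),
    Paper(frozenset({"A", "C"}), "J3",
          [cite({"C"}, "J9"), cite({"Z"}, "J8"), cite({"W"}, "J7")]),
]

raw_h = h_index([len(p.citations) for p in papers])
clean_h = revised_h_index(papers, "A")
print(raw_h, clean_h)
```

In this made-up example the raw citation counts are 3, 3, and 3, while the counts after exclusion are 1, 2, and 2, so the h-index drops from 3 to 2. The percentages reported in the abstract come from the authors' datasets and are not reproduced by this sketch.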

Keywords