Iraqi Journal for Computers and Informatics (Jun 2024)
Identifying Researchers’ Interest using Text Mining
Abstract
Researchers' interests and academic journals are both crucial for advancing scientific inquiry. Journals serve as platforms for sharing and validating discoveries, fostering a symbiotic relationship that advances collective understanding and pushes the boundaries of human knowledge. They publish cutting-edge research and establish benchmarks for academic rigor. In this paper, text mining is applied to the publications of Iraqi researchers in scientific journals in order to extract each researcher's interests. In more detail, the paper uses the following techniques to identify a researcher's interests from the analysis of texts (title and abstract): pre-processing (tokenization, part-of-speech (POS) tagging, normalization, case folding, lemmatization), filtering (stop-word elimination), feature extraction (TF-IDF), and classification using a deep neural network classifier (DNNC). The data cover Iraqi researchers' publications in the field of computer science from 2010 to 2022. A total of 1170 papers were collected from the Scopus repository via an API key and a scraper, filtered by the keyword "computer science" and by year. These papers were then manually classified according to the hierarchical classification of the ACM journal. Finally, the best results, obtained by classifying with the DNN on TF-IDF features, achieved a precision of 90%, a recall of 90%, an F1-score of 90%, and an accuracy of 90%.
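As a rough illustration of the filtering and feature-extraction steps named in the abstract, the following minimal sketch computes TF-IDF weights over a few short texts. It is not the authors' implementation: the stop-word list, the function names `preprocess` and `tf_idf`, the example titles, and the particular weighting (term frequency normalized by document length, multiplied by ln(N/df)) are all assumptions made for illustration. POS tagging, lemmatization, and the DNN classifier are omitted.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop-word list used only for this illustration.
STOP_WORDS = {"a", "an", "the", "of", "for", "in", "and", "to", "using", "with", "on"}


def preprocess(text):
    """Case-fold, tokenize on alphabetic runs, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf = count / document length, idf = ln(N / df).
    """
    tokenized = [preprocess(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up titles, not taken from the paper's dataset.
docs = [
    "Deep learning for image classification",
    "Text mining of research abstracts",
    "Image segmentation using deep networks",
]
vectors = tf_idf(docs)
```

In this toy corpus, a term that occurs in only one text (such as "learning") receives a higher weight than one shared by two texts (such as "deep"), which is the property that makes TF-IDF features useful as classifier input.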
Keywords