Deep Learning Based Biomedical Literature Classification Using Criteria of Scientific Rigor

Muhammad Afzal; Beom Joo Park; Maqbool Hussain; Sungyoung Lee

doi:10.3390/electronics9081253

Electronics (Aug 2020)

Affiliations

Muhammad Afzal: Department of Software, Sejong University, Seoul 05006, Korea
Beom Joo Park: Ubiquitous Computing Lab, Department of Computer Science and Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu, Yongin-si, Gyeonggi-do 446-701, Korea
Maqbool Hussain: Department of Software, Sejong University, Seoul 05006, Korea
Sungyoung Lee: Ubiquitous Computing Lab, Department of Computer Science and Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu, Yongin-si, Gyeonggi-do 446-701, Korea

DOI: https://doi.org/10.3390/electronics9081253
Journal volume & issue: Vol. 9, no. 8, p. 1253

Abstract

A major obstacle to supporting evidence-based clinical decision-making is accurately and efficiently identifying appropriate and scientifically rigorous studies in the biomedical literature. We trained a multi-layer perceptron (MLP) model on a dataset with two textual features, title and abstract. The dataset, consisting of 7958 PubMed citations classified into two classes, scientifically rigorous and non-rigorous, was used to train the proposed model. We compared our model with other promising machine learning models: Support Vector Machine (SVM), Decision Tree, Random Forest, and Gradient Boosted Tree (GBT). Deep learning achieved the highest cumulative score, so it was selected and evaluated on test datasets obtained by running a set of domain-specific queries. On the training dataset, the proposed deep learning model obtained significantly higher accuracy (97.3%) and AUC (0.993) than the competitors, but its recall of 95.1% was slightly lower than that of GBT. The trained model sustained its performance on the testing datasets. Unlike previous approaches, the proposed model does not require a human expert to create fresh annotated data; instead, we used studies cited in Cochrane reviews as a surrogate for quality studies in a clinical topic. We conclude that deep learning methods are beneficial for biomedical literature classification: they reduce the feature engineering workload and also perform better on large and noisy data.
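The abstract compares models on accuracy, recall, and AUC. As a minimal sketch of how these three metrics are typically computed for a binary rigorous/non-rigorous classifier, the following uses only the Python standard library. The function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical illustrations; none of them come from the paper, and this is not the authors' evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of citations whose predicted class matches the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of truly rigorous citations that the model retrieves."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)


def auc(y_true, scores, positive=1):
    """Area under the ROC curve in its rank (Mann-Whitney) form:
    the probability that a randomly chosen positive scores higher than
    a randomly chosen negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the paper): 1 = rigorous, 0 = non-rigorous.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
preds = [int(s >= 0.5) for s in scores]

print("accuracy:", accuracy(labels, preds))
print("recall:  ", recall(labels, preds))
print("AUC:     ", auc(labels, scores))
```

Accuracy and recall depend on the chosen threshold, while AUC is computed from the raw scores and is threshold-independent, which is one reason a model can lead on AUC and accuracy yet trail slightly on recall.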

Published in Electronics

ISSN: 2079-9292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics
Website: http://www.mdpi.com/journal/electronics
