Practicability of Ensemble Artificial Neural Network Models for a Classification Task: An Optimal Approach for Reproducing Classification Practices of Health Consumers Generated Text on Social Media

Sukjin You; Min Sook Park; Soohyung Joo

doi:10.6182/jlis.202206_20(1).001

Journal of Library and Information Studies (Jun 2022)

Affiliations

Sukjin You: School of Information Studies, University of Wisconsin at Milwaukee, Wisconsin, USA
Min Sook Park: School of Information Studies, University of Wisconsin at Milwaukee, Wisconsin, USA
Soohyung Joo: School of Information Science, University of Kentucky, Lexington, Kentucky, USA

Journal volume & issue: Vol. 20, no. 1
pp. 1 – 30

Abstract

This paper reports the classification accuracy of artificial neural network (ANN) models in reproducing health consumers’ classification practices in social media. Social media have driven the growth of unstructured text data across domains including health, which motivates researchers to reconsider the epistemological approach to automated classification. This study compared the performance of several types of individual ANN models with that of two kinds of ensemble models: those that combine classification results and those that integrate multiple ANN structures. To train these models, two dictionaries were employed: one of health consumers’ terms extracted from questions and answers in the health categories of Yahoo! Answers, and one of MeSH terms. All three types of individual classifiers demonstrated accuracies of around 90%. In particular, the fully connected ANN with two layers produced higher classification performance than the convolutional neural network (CNN) and long short-term memory (LSTM) models. Ensemble models based on classification results outperformed not only the ensemble models based on the integration of heterogeneous ANN structures but also the individual deep-learning models. The combination of questions and best answers was found to be the most effective training dataset for building an accurate prediction model. The findings suggest that ANN models can be an effective assistive tool in classifying online health resources generated by health consumers in natural language.
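
As an illustration of what an ensemble "based on classification results" might look like, the sketch below combines the labels predicted by three separate classifiers through a simple majority vote. The abstract does not state which combination rule the authors used, so majority voting is an assumption here, and the function name, the tie-breaking rule, and the category labels are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label predictions by majority vote.

    predictions: a list with one inner list of labels per model;
    every inner list holds one label per document, in the same order.
    Ties go to the label from the earliest model in the list
    (an arbitrary choice made for this sketch).
    """
    n_docs = len(predictions[0])
    if any(len(p) != n_docs for p in predictions):
        raise ValueError("all models must label the same number of documents")

    combined = []
    for labels in zip(*predictions):  # labels for one document, one per model
        counts = Counter(labels)
        top = max(counts.values())
        # Pick the first label, in model order, that reaches the top count.
        combined.append(next(label for label in labels if counts[label] == top))
    return combined


# Hypothetical outputs of three individual classifiers for four documents.
fc_ann = ["diabetes", "cancer", "diabetes", "heart"]
cnn = ["diabetes", "heart", "diabetes", "heart"]
lstm = ["cancer", "cancer", "diabetes", "cancer"]

ensemble = majority_vote([fc_ann, cnn, lstm])
print(ensemble)
```

An ensemble built this way only needs the final labels from each model, which is what distinguishes it from an ensemble that merges the network structures themselves into a single model.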

Published in Journal of Library and Information Studies

ISSN: 1606-7509 (Print)
Publisher: National Taiwan University
Country of publisher: Taiwan, Province of China
LCC subjects: Bibliography. Library science. Information resources
Website: http://jlis.lis.ntu.edu.tw/
