Journal of Library and Information Studies (Jun 2022)

Practicability of Ensemble Artificial Neural Network Models for a Classification Task: An Optimal Approach for Reproducing Classification Practices of Health Consumers Generated Text on Social Media

  • Sukjin You,
  • Min Sook Park,
  • Soohyung Joo

DOI
https://doi.org/10.6182/jlis.202206_20(1).001
Journal volume & issue
Vol. 20, no. 1
pp. 1 – 30

Abstract

Read online

This paper reports the classification accuracy of artificial neural network (ANN) models in reproducing health consumers’ classification practices in social media. Social media have driven the growth of unstructured text data across domains including health, which motivates researchers to reconsider the epistemological approach to automated classification. This study compared the performance of several types of ANN models and ensemble models based on classification results and the integration of multiple ANN structures. To train these models, two dictionaries were employed: health consumers’ terms extracted from questions and answers in the health categories of Yahoo!Answers and MeSH terms. All three types of individual classifiers demonstrated accuracies of around 90%. In particular, the fully connected ANN with two layers produced relatively higher classification performances than a convolutional neural network and long short-term memory. Ensemble models based on classification results outperformed not only the ensemble models based on the integration of heterogeneous ANN structures but also individual deep-learning models. The combination of questions and best answers were found to be most effective as a training dataset to build an accurate prediction model. The findings suggest that ANN models can be an effective assistive tool in classifying online health resources generated by health consumers in natural language.

Keywords