MethodsX (Jun 2024)

Detecting health misinformation: A comparative analysis of machine learning and graph convolutional networks in classification tasks

  • Bharti Khemani,
  • Shruti Patil,
  • Ketan Kotecha,
  • Deepali Vora

Journal volume & issue
Vol. 12
p. 102737

Abstract

In the digital age, the proliferation of health-related information online has heightened the risk of misinformation, posing substantial threats to public well-being. This research conducts a comparative analysis of classification models for detecting health misinformation. The study evaluates the performance of traditional machine learning models and graph convolutional networks (GCN) across key performance metrics. The results characterize each algorithm's effectiveness in identifying health misinformation and provide insights for combating the spread of false health information online. GCN with TF-IDF gives the best result, as shown in the results section.

  • The research method involves a comparative analysis of classification algorithms to detect health misinformation, exploring traditional machine learning models and graph convolutional networks.
  • The algorithms employed were Passive Aggressive Classifier, Random Forest, Decision Tree, Logistic Regression, LightGBM, GCN, GCN with BERT, GCN with TF-IDF, and GCN with Word2Vec. Reported accuracy: Passive Aggressive Classifier 85.75 %, Random Forest 86 %, Decision Tree 81.30 %, LightGBM 83.29 %, plain GCN 84.53 %, GCN with BERT 85.00 %, GCN with TF-IDF 93.86 %, and GCN with Word2Vec 81.00 %.
  • Accuracy, precision, recall, and F1-score were systematically evaluated to assess the efficacy of each model in detecting health misinformation, with a focus on the strengths and limitations of the different approaches. GCN with TF-IDF embeddings performed best, achieving an accuracy of 93.86 %.
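The four metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary setting in which label 1 marks misinformation and label 0 marks reliable content. The function name `binary_metrics`, the `Metrics` container, and the toy labels are hypothetical and are not taken from the paper, which may compute or average its metrics differently.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class.

    Assumption: labels are binary and `positive` marks misinformation.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Metrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
    print(binary_metrics(y_true, y_pred))
```

Reporting precision and recall alongside accuracy matters when the classes are imbalanced, since a model can reach high accuracy while missing most misinformation items.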

Keywords