IEEE Access (Jan 2023)
Detecting Low-Credibility Medical Websites Through Semi-Supervised Learning Techniques
Abstract
The spread of biased information and disinformation has intensified in recent times due to the increased reliance on the Internet and social media platforms as primary sources of information. This issue is of particular concern in medicine and healthcare, given the critical nature of the decisions made in these areas. Medical experts can mitigate the repercussions of misinformation by evaluating questionable websites, but this approach is time-consuming. Consequently, there is a pressing need for software solutions that automate the detection of misleading information. This paper presents the CO-training and Active Learning-based framework for Finding Low-credibility web Addresses in the MEdical field (COAL4FLAME), a novel system designed to analyze health-related websites and identify misinformation. The system integrates the results of multiple estimators to reach a comprehensive conclusion. COAL4FLAME uses semi-supervised learning strategies, such as multi-view learning, co-training, and active learning, to address the challenge of limited labeled data; this is a major strength of the proposal. Medical experts have rigorously tested and evaluated the system using a selected set of websites.
Keywords