网络与信息安全学报 (Chinese Journal of Network and Information Security) (Aug 2023)

Construction of multi-modal social media dataset for fake news detection

  • Guopeng GAO, Yaodong FANG, Yanfang HAN, Zhenxing QIAN, Chuan QIN

DOI
https://doi.org/10.11959/j.issn.2096-109x.2023060
Journal volume & issue
Vol. 9, no. 4
pp. 144 – 154

Abstract

The advent of social media has brought about significant changes in people’s lives. While social media allows for easy access and sharing of news, it has also become a breeding ground for the dissemination of fake news, posing a serious threat to social security and stability. Consequently, researchers have shifted their focus towards fake news detection. Although several deep learning-based solutions have been proposed, these methods rely heavily on large amounts of supporting data. Currently, existing datasets are scarce, particularly in Chinese, and the news articles they collect are often limited to a single category. To enhance the detection of fake news, a new multi-modal fake news dataset (MFND) was developed, which comprises Chinese and English news data from ten diverse categories: politics, economy, entertainment, sports, international affairs, technology, military, education, health, and social life. The word frequencies and categories of the proposed dataset were analyzed, and the dataset was compared with existing fake news datasets in terms of number of news items, news categories, modal information, and news languages. The comparison demonstrates that MFND excels in terms of category information and news languages. Moreover, when existing typical fake news detection methods are trained and validated on MFND, the experimental results show an improvement of approximately 10% in model performance compared to existing mainstream fake news datasets.

Keywords