Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) (Dec 2024)
The Impact of Feature Extraction in Random Forest Classifier for Fake News Detection
Abstract
Fake news spreads rapidly on online platforms, causing a concerning dissemination of misinformation. Its influence has become a pressing social problem, shaping public opinion during important events such as elections. This research focuses on detecting and classifying fake news with the Random Forest algorithm and investigates the impact of feature extraction on classification accuracy, specifically using the TF-IDF method. For this purpose, we used 44,898 English-language articles from the ISOT fake news dataset. The dataset was cleaned using tokenization and stemming, then split into 75% training and 25% testing data. The TF-IDF vectorizer was applied as the feature extraction step to convert the text into numeric features. A Random Forest classifier was then implemented to predict whether an article is real or fake. The proposed model is compared with existing models to assess its overall classification precision. The results highlight the efficacy of combining the TF-IDF vectorizer with Random Forest, which achieved an accuracy of 99.0%. This contribution offers an effective strategy for combating misinformation through precise text classification.
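As a rough illustration of the feature extraction step described above, the following minimal sketch computes TF-IDF weights in plain Python. It is not the authors' implementation: the whitespace tokenizer, the length-normalized term frequency, the unsmoothed natural-log inverse document frequency, and the three toy documents are all assumptions made for this example. Library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and vector normalization by default, so their values will differ.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace. The study's exact
    # tokenization and stemming steps are not reproduced here.
    return text.lower().split()


def tfidf(docs):
    """Return one {term: weight} dict per document, with weight = tf * idf."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            # tf is the term's share of the document; idf is log(N / df).
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy headlines, invented for illustration only.
docs = [
    "senate passes budget bill",
    "shocking secret budget hoax",
    "senate hoax claims shocking",
]
weights = tfidf(docs)
```

In this sketch a term that occurs in only one document (such as "passes") receives a larger weight than a term shared across documents (such as "budget"), which is the property that makes the resulting numeric vectors useful as classifier input.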
Keywords