IEEE Access (Jan 2021)

Predicting Influential Blogger’s by a Novel, Hybrid and Optimized Case Based Reasoning Approach With Balanced Random Forest Using Imbalanced Data

  • Yousra Asim,
  • Ahmad Kamran Malik,
  • Basit Raza,
  • Ahmad R. Shahid,
  • Nafees Qamar

DOI
https://doi.org/10.1109/ACCESS.2020.3048610
Journal volume & issue
Vol. 9
pp. 6836 – 6854

Abstract

Bloggers can understand and influence the opinions of a wide community of fans and followers through the content they post online. This influence is valuable to companies that want to promote their products or services to diverse audiences in different localities and that are always looking for quick, suitable ways to reach the public. For this reason, influential bloggers are preferred for initiating online marketing campaigns, but identifying them is a challenging task because of the large number of blogger communities. The novelty of this paper lies in the proposed Framework for Influential Blogger Prediction based on Blogger and Blog Features (IBP-BBF), which uses Case-Based Reasoning (CBR) and handles not only labeled data but also unstructured data (blogs) and imbalanced data in an optimized way. Detailed labeled data were collected through an online survey of 129 bloggers, and unstructured data through text mining of their 32,200 blogs. The classification results are compared with and validated against state-of-the-art machine learning techniques using standard evaluation measures in the context of imbalanced data. The results show that the proposed IBP-BBF framework with CBR modeling outperforms existing techniques in classifying and adapting the influential blogger prediction, and that it performs better than baseline imbalanced data classification techniques. Balanced Random Forest is found to contribute more to the performance of the CBR approach than the Balanced Bagging Classifier and the RUSBoost classifier. With the CBR approach, baseline techniques can be better optimized for influential blogger identification.
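
As a rough illustration of the resampling idea behind Balanced Random Forest, the sketch below draws a class-balanced bootstrap sample of the kind each tree in such a forest is typically trained on. It is not the authors' implementation: the function name `balanced_bootstrap` is hypothetical, and the 20/109 class split is an assumed toy example (only the total of 129 bloggers comes from the abstract).

```python
import random
from collections import Counter


def balanced_bootstrap(labels, rng):
    """Return indices of a class-balanced bootstrap sample.

    Every class contributes as many draws (with replacement) as the
    smallest class has members, so no class dominates the sample.
    """
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    n_minority = min(len(members) for members in by_class.values())
    sample = []
    for members in by_class.values():
        sample.extend(rng.choices(members, k=n_minority))
    rng.shuffle(sample)
    return sample


# Assumed toy labels: 1 = influential, 0 = not influential.
# The 20/109 split is illustrative only and is not taken from the paper.
labels = [1] * 20 + [0] * 109
sample = balanced_bootstrap(labels, random.Random(0))
counts = Counter(labels[i] for i in sample)
```

Here `counts` holds the per-class totals of the drawn sample, which are equal by construction. A forest built this way repeats the draw once per tree, so the majority class is still covered across the ensemble even though each tree sees a balanced subset.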

Keywords