IEEE Access (Jan 2022)

Estimating the Bot Population on Twitter via Random Walk Based Sampling

  • Mei Fukuda,
  • Kazuki Nakajima,
  • Kazuyuki Shudo

DOI
https://doi.org/10.1109/ACCESS.2022.3149887
Journal volume & issue
Vol. 10
pp. 17201 – 17211

Abstract

Social bots, which contribute to marketing, political intervention, and the spread of fake news, are on the rise. Methods for analyzing the characteristics of Twitter bots have been developed for third-party researchers, who have limited access to Twitter data. Here, we propose a method for estimating the bot population on Twitter based on a random walk. The proposed method addresses two major problems that arise in random-walk-based estimation on Twitter. First, the maximum number of friends or followers of a user that can be retrieved per query is limited. Second, a certain percentage of users are private and do not publish personal content, e.g., friends, followers, and tweets. We conduct a simulation analysis using directed social graph datasets to validate whether the proposed estimator is effective on the real Twitter follow graph. Then, we present three estimates of the bot population on Twitter, obtained by applying the proposed estimator to three sample sequences of 25,000 users, each collected over 2.5 weeks. The three estimates consistently suggest that 8%–18% of Twitter users during April–June 2021 were bots.
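
As a rough illustration of random-walk-based estimation in general, the sketch below applies the standard re-weighted random walk estimator to a small toy graph. It is not the estimator proposed in the paper: it assumes an undirected graph, unlimited neighbor queries, and no private users, which are exactly the simplifications the paper sets out to remove. The toy graph, the bot labels, and the function names `random_walk` and `estimate_bot_fraction` are all hypothetical.

```python
import random


def random_walk(adj, start, steps, rng):
    """Simple random walk on an undirected graph given as adjacency lists."""
    node = start
    walk = []
    for _ in range(steps):
        walk.append(node)
        node = rng.choice(adj[node])  # move to a uniformly random neighbor
    return walk


def estimate_bot_fraction(walk, adj, is_bot):
    """Re-weighted estimate of the fraction of bot nodes.

    A simple random walk visits nodes in proportion to their degree, so each
    sampled node is weighted by 1/degree to correct for that bias.
    """
    weighted_bots = sum(is_bot[v] / len(adj[v]) for v in walk)
    total_weight = sum(1 / len(adj[v]) for v in walk)
    return weighted_bots / total_weight


# Hypothetical toy graph: hub node 0 linked to nodes 1..9, which form a ring.
adj = {0: list(range(1, 10))}
for i in range(1, 10):
    adj[i] = [0, (i % 9) + 1, ((i - 2) % 9) + 1]

# Assumed labels: nodes 0, 1, and 2 are bots, so the true fraction is 0.3.
is_bot = {v: v in (0, 1, 2) for v in adj}

rng = random.Random(0)
walk = random_walk(adj, start=1, steps=20000, rng=rng)
estimate = estimate_bot_fraction(walk, adj, is_bot)
naive = sum(is_bot[v] for v in walk) / len(walk)  # biased toward the hub
print(f"re-weighted: {estimate:.3f}, naive: {naive:.3f}, true: 0.300")
```

In this toy setting the unweighted sample mean overcounts bots because the high-degree hub is a bot and is visited often, while the degree-corrected estimate is not skewed by degree in this way. On the real Twitter follow graph, edges are directed and neighbor lists are truncated or hidden, so this plain estimator does not apply directly.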

Keywords