Measurement: Sensors (Apr 2024)

Fuzzy similarity based hierarchical clustering for communities in twitter social networks

  • R. Suganthi
  • K. Prabha

Journal volume & issue
Vol. 32
p. 101033

Abstract

The complexity of the Internet has increased significantly as it has expanded. In particular, social networking has produced user clusters belonging to many communities at different levels. The study of online communities is a growing area of research in which user preferences can be categorized and groups, including dangerous ones, can be identified. This work applies pre-processing techniques including stemming, stop-word removal, and tokenization of the data into unigram, bigram, and 1–3-gram representations. TF-IDF (Term Frequency-Inverse Document Frequency) and word embeddings are used for feature extraction. Each perspective (view) generates a hierarchical data structure; these structures are then combined using consensus matrices, followed by computation of dissimilarities between observed sets. This work employs FS-HC (Fuzzy Similarity based Hierarchical Clustering) to generate a dendrogram for each view. Finally, consensus matrices, which contain representative data of the generated dendrograms, are produced by integrating the several hierarchical agglomerations through transitive consensus matrix generation. Performance metrics including precision, recall, F-measure, accuracy, clustering coefficient, conductance, and contraction are used to evaluate the outcomes against benchmark clustering algorithms. The proposed approach attains a higher accuracy, about 92 %, than the existing algorithms it is compared with. The data required for community discovery in social media were gathered from Twitter.
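
The multi-view pipeline described above can be illustrated with a minimal sketch. It is not the authors' FS-HC implementation, and every function name, the toy documents, and the parameter choices are hypothetical. The assumptions are: stemming and stop-word removal are skipped, TF-IDF uses one common smoothed IDF variant, the fuzzy similarity is the weighted (min/max) Jaccard measure, the consensus matrix is a plain element-wise mean of the per-view similarity matrices rather than the transitive consensus generation named in the abstract, and the final grouping uses average linkage.

```python
import math
from collections import Counter

def ngrams(tokens, n_min, n_max):
    """Return all n-grams of length n_min..n_max as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(n_min, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def tfidf(docs, n_min, n_max):
    """Build one TF-IDF vector (dict: term -> weight) per document."""
    bags = [Counter(ngrams(d.lower().split(), n_min, n_max)) for d in docs]
    df = Counter(term for bag in bags for term in bag)  # document frequency
    n_docs = len(docs)
    return [{t: (c / sum(bag.values()))
                * (math.log((1 + n_docs) / (1 + df[t])) + 1)  # smoothed IDF (assumed variant)
             for t, c in bag.items()}
            for bag in bags]

def fuzzy_jaccard(a, b):
    """Weighted Jaccard similarity: sum of minima over sum of maxima (assumed measure)."""
    terms = set(a) | set(b)
    num = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    den = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    return num / den if den else 0.0

def similarity_matrix(vectors):
    """Pairwise fuzzy similarity between all document vectors of one view."""
    n = len(vectors)
    return [[fuzzy_jaccard(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]

def consensus(matrices):
    """Element-wise mean of the per-view matrices (a simplification)."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / len(matrices) for j in range(n)]
            for i in range(n)]

def agglomerate(sim, k):
    """Average-linkage agglomerative clustering, merging until k clusters remain."""
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                link = (sum(sim[i][j] for i in clusters[x] for j in clusters[y])
                        / (len(clusters[x]) * len(clusters[y])))
                if best is None or link > best[0]:
                    best = (link, x, y)
        _, x, y = best
        clusters[x] += clusters.pop(y)  # y > x, so index x stays valid
    return clusters

# Toy stand-ins for tweets (hypothetical data, not from the paper).
docs = ["football match tonight", "football match tomorrow",
        "stock market crash", "stock market rally"]
views = [(1, 1), (2, 2), (1, 3)]  # unigram, bigram, and 1-3-gram views

cons = consensus([similarity_matrix(tfidf(docs, lo, hi)) for lo, hi in views])
clusters = agglomerate(cons, k=2)
print(clusters)
```

Averaging the per-view matrices is the simplest way to let the three n-gram views vote on how close two documents are. A closer match to the abstract would build one dendrogram per view and combine those, which this sketch does not attempt.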

Keywords