IEEE Access (Jan 2023)

Unveiling Cryptocurrency Conversations: Insights From Data Mining and Unsupervised Learning Across Multiple Platforms

  • Hae Sun Jung,
  • Haein Lee,
  • Jang Hyun Kim

DOI
https://doi.org/10.1109/ACCESS.2023.3334617
Journal volume & issue
Vol. 11
pp. 130573 – 130583

Abstract

The rapid growth of the cryptocurrency market has led to an increasing interest in the subject. Cryptocurrency is now recognized as an asset, and laws and financial regulations have begun to emerge to support its practical use. As a result, it has become essential to perform data mining and attain knowledge from text data related to cryptocurrency. Previous studies have focused on analyzing data from a single source such as Twitter. However, there are unique insights to be gained from data across multiple platforms. In the present study, we utilized data mining techniques to extract insights from LexisNexis, Web of Science, and Reddit, representing the media, academia, and the general public, respectively. Among unsupervised learning techniques, topic modeling was employed for the analysis. Topic modeling is a methodology that uncovers hidden meanings within the collected data. Among the diverse topic modeling techniques available, bidirectional encoder representations from transformers topic (BERTopic) was chosen for the analysis. BERTopic is considered to be state-of-the-art in the field of topic modeling. Dynamic topic modeling was employed to track changes in themes over time. Our experimental results reveal a tendency in the news to cover major events related to cryptocurrencies, such as regulatory developments and market trends. Academic papers, on the other hand, tend to focus on the technology behind cryptocurrencies and related research. Finally, social media conversations center more around information delivery from an investor’s psychological perspective, such as market sentiment and investment strategies.

Keywords