Journal of Intelligent Systems (May 2019)

Design and Evaluation of Outlier Detection Based on Semantic Condensed Nearest Neighbor

  • Batchanaboyina M. Rao,
  • Devarakonda Nagaraju

DOI
https://doi.org/10.1515/jisys-2018-0476
Journal volume & issue
Vol. 29, no. 1
pp. 1416 – 1424

Abstract

Social media contain abundant information about events and news occurring all over the world. The growth of social media has had a great impact on domains such as marketing, e-commerce, health care, e-governance, and politics. Twitter is now one of the most popular social media platforms, with 1 billion user profiles and millions of active users who post tweets daily. In this research, buzz detection in social media was carried out with a semantic approach using the condensed nearest neighbor (SACNN). The Twitter and Tom’s Hardware data stored in the UC Irvine Machine Learning Repository are used in this research for outlier detection. Min–max normalization is applied to the social media dataset, and missing values are replaced by the normalized value. The condensed nearest neighbor (CNN) is used for semantic analysis of the database, and the threshold is calculated from the optimized value provided by the proposed method. The threshold value is used to classify buzz and non-buzz discussions in the social media database. The results showed that SACNN achieved 99% accuracy, with a lower relative error than the existing methods.
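
The two preprocessing and condensing steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' SACNN implementation: it shows plain min–max normalization followed by the classic condensed nearest neighbor rule (Hart's algorithm) on toy data. The function names, the toy values, and the choice to fill a missing value with the mean of the normalized column are assumptions made for illustration; the abstract does not specify them, and the semantic analysis and threshold optimization are not reproduced here.

```python
import math


def min_max_normalize(rows):
    """Scale every column to [0, 1]; None marks a missing value."""
    out = [list(r) for r in rows]
    for j in range(len(rows[0])):
        present = [r[j] for r in rows if r[j] is not None]
        lo, hi = min(present), max(present)
        span = (hi - lo) or 1.0  # constant column: avoid division by zero
        scaled = [(v - lo) / span for v in present]
        # Assumption: a missing entry takes the mean of the normalized column.
        fill = sum(scaled) / len(scaled)
        for i, r in enumerate(rows):
            out[i][j] = fill if r[j] is None else (r[j] - lo) / span
    return out


def nearest_label(point, store):
    """Label of the 1-nearest neighbor of `point` among (point, label) pairs."""
    return min(store, key=lambda s: math.dist(point, s[0]))[1]


def condense(points, labels):
    """Hart's condensed nearest neighbor: grow a subset of the training set
    until 1-NN on that subset classifies every training point correctly."""
    store = [(points[0], labels[0])]
    changed = True
    while changed:
        changed = False
        for p, y in zip(points, labels):
            # Skip pairs already stored so duplicate points with conflicting
            # labels cannot cause an endless loop.
            if (p, y) not in store and nearest_label(p, store) != y:
                store.append((p, y))
                changed = True
    return store


# Toy data (hypothetical): two features, one missing value.
raw = [[0.0, 10.0], [1.0, None], [2.0, 30.0], [10.0, 20.0]]
normalized = min_max_normalize(raw)

# Toy one-dimensional example with two classes.
points = [(0.0,), (0.1,), (0.2,), (0.9,), (1.0,)]
labels = ["non-buzz", "non-buzz", "non-buzz", "buzz", "buzz"]
prototypes = condense(points, labels)
```

On this toy input the condensed set keeps one prototype per class, which is the purpose of the condensing step: the classifier stores far fewer points than the full training set while still labeling every training point correctly.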

Keywords