Journal of Applied Mathematics (Jan 2022)
Classification Algorithm for Heterogeneous Network Data Streams Based on Big Data Active Learning
Abstract
Data classification is one of the main tasks in data mining. Existing network data classification algorithms suffer from too small a proportion of labeled samples, heavy noise, and redundant data, all of which lower classification accuracy on data streams. Network embedding can effectively alleviate these problems, but network embedding itself has problems such as capturing relational honor and ambiguity. This study proposes SNN-RODE-LapRLS, a heterogeneous network data classification algorithm. It achieves deep embedding of structure and semantics among nodes by constructing a multitask SNN and selecting dead song datasets on which to perform the mining tasks that train the neural network. A semisupervised classifier based on the Laplacian regularized least squares (LapRLS) regression model is then designed; it uses the relative support difference function as the decision method and optimizes that function. Simulation results show that the SNN-RODE-LapRLS algorithm improves performance by 14%-51% over mainstream classification algorithms and that its time consumption meets the demand of real-time classification.
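
To make the semisupervised step concrete, the following is a minimal sketch of a generic binary LapRLS classifier in its standard kernel form. It is not the SNN-RODE pipeline described above. The function names (`rbf_kernel`, `knn_laplacian`, `fit_laprls`), the RBF kernel, the k-nearest-neighbor graph, and all hyperparameter values are assumptions chosen for illustration. A plain sign rule stands in for the relative support difference function, whose exact form is not given here.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def knn_laplacian(X, k=5):
    """Unnormalized graph Laplacian L = D - W of a symmetrized k-NN graph."""
    n = X.shape[0]
    sq = (X ** 2).sum(axis=1)[:, None] + (X ** 2).sum(axis=1)[None, :] - 2.0 * X @ X.T
    # Column 0 of the argsort is the point itself, so skip it.
    neighbors = np.argsort(sq, axis=1)[:, 1:k + 1]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), neighbors.ravel()] = 1.0
    W = np.maximum(W, W.T)  # make the adjacency symmetric
    return np.diag(W.sum(axis=1)) - W

def fit_laprls(X, y_labeled, gamma_a=1e-2, gamma_i=1e-1, sigma=1.0, k=5):
    """Fit LapRLS; the first len(y_labeled) rows of X are the labeled samples.

    Labels are assumed to be +1 / -1. Returns the expansion coefficients
    alpha and the kernel matrix K, so that K @ alpha gives the scores on X.
    """
    n, l = X.shape[0], len(y_labeled)
    K = rbf_kernel(X, X, sigma)
    L = knn_laplacian(X, k)
    J = np.zeros((n, n))
    J[:l, :l] = np.eye(l)            # selects the labeled samples
    Y = np.zeros(n)
    Y[:l] = y_labeled                # unlabeled targets stay at zero
    # Closed-form LapRLS system: (J K + gamma_a*l*I + gamma_i*l/n^2 * L K) alpha = Y
    A = J @ K + gamma_a * l * np.eye(n) + (gamma_i * l / n ** 2) * (L @ K)
    alpha = np.linalg.solve(A, Y)
    return alpha, K
```

Given a feature matrix whose first rows are the labeled samples, `alpha, K = fit_laprls(X, y_labeled)` followed by `np.sign(K @ alpha)` yields predicted labels for every sample, labeled and unlabeled. In the proposed algorithm the inputs would be the node embeddings produced by the multitask SNN rather than raw features.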