IEEE Access (Jan 2024)
Concept Drift Detection Based on Typicality and Eccentricity
Abstract
Many applications and fields produce vast quantities of time-relevant or continuously changing data, which may represent new phenomena. This data stream behavior is known as concept drift. Many areas now need to process online data streams efficiently and accurately, yet concept drift degrades the performance of classical machine learning approaches, so it must be addressed before real-world applications fed by data streams can be deployed. This work presents a perspective on applying a Concept Drift Detector (CDD) to empower a data stream classifier in a real-world scenario, and then proposes a Concept Drift Detector based on Typicality and Eccentricity Data Analytics (TEDA-CDD). Our method monitors the data stream with two models in order to keep the information of a previous concept while monitoring the emergence of a new one. The models are considered to represent two distinct concepts when the intersection of their data samples is significantly low, as measured by the Jaccard index. TEDA-CDD is compared with known methods from the literature in experiments using synthetic and real-world datasets that simulate real-world applications. In these experiments, TEDA-CDD achieves accuracy comparable to that of well-established algorithms while being more memory-efficient.
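As a rough illustration of the Jaccard-based criterion, the sketch below compares the sets of sample indices covered by two models and reports distinct concepts when their overlap falls below a threshold. The function names, the index-set representation of each model, and the 0.2 threshold are assumptions made for illustration; they are not taken from the paper and do not reproduce TEDA-CDD itself.

```python
def jaccard_index(a, b):
    """Return |A ∩ B| / |A ∪ B| for two collections of hashable items.

    By convention here, two empty collections are treated as identical (1.0).
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def concepts_differ(samples_model_a, samples_model_b, threshold=0.2):
    """Hypothetical decision rule: the two models are taken to describe
    distinct concepts when the Jaccard index of their sample sets is
    below `threshold` (an assumed value, not one given in the paper)."""
    return jaccard_index(samples_model_a, samples_model_b) < threshold


# Indices of stream samples that each (hypothetical) model covers.
previous_concept = range(0, 8)   # samples 0..7
emerging_concept = range(6, 14)  # samples 6..13, overlapping only on 6 and 7
drift_suspected = concepts_differ(previous_concept, emerging_concept)
```

A set-based formulation keeps the check independent of how each model decides which samples it covers, which is the part the abstract leaves to the full method.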
Keywords