Journal of King Saud University: Computer and Information Sciences (Sep 2022)
Providing contexts for classification of transients in a wide-area sky survey: An application of noise-induced cluster ensemble
Abstract
New sensor systems now capture sky surveys at high quality, so the next challenge is analyzing the resulting data within a limited time frame. For the GOTO project, this task is crucial for discovering new transients in a large pool of candidates. Earlier work based on a feature-based approach framed this detection as an imbalanced classification problem, in which a data-level method can be used to resolve the difference in cardinality between classes. This paper presents a context generation framework to complement the previously proposed model. Specifically, samples are clustered to form data contexts, to which different learning strategies may be applied. To ensure the quality of the clustering, a noise-induced cluster ensemble technique recently introduced in the literature is employed. Results with simulated data and the NB, C4.5 and KNN algorithms show that the proposed framework can quickly filter out some negative samples while making classification of the rest more effective. In particular, it improves the predictive performance of basic classifiers, lifting F1 scores from less than 0.1 to around 0.3–0.5. A parameter analysis is also given as a guideline for applying the framework.
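The context idea summarized in the abstract can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper. Plain k-means with fixed starting centroids stands in for the noise-induced cluster ensemble, a 1-nearest-neighbour rule stands in for the base classifiers, and the function names (`nearest`, `kmeans`, `fit_contexts`, `predict`) and the toy data are hypothetical, made up purely for illustration. Contexts that contain no positive training samples are treated as "filter out quickly"; the remaining contexts get their own local classifier.

```python
from collections import defaultdict
from math import dist


def nearest(p, centroids):
    """Index of the centroid closest to point p (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))


def kmeans(points, centroids, iters=10):
    """Plain Lloyd's k-means from fixed starting centroids.

    Assumption: this is a stand-in for the cluster ensemble named in the
    abstract, not a reproduction of it.
    """
    for _ in range(iters):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(col) / len(groups[i]) for col in zip(*groups[i]))
            if groups[i] else centroids[i]
            for i in range(len(centroids))
        ]
    return centroids


def fit_contexts(X, y, init_centroids):
    """Cluster the training samples and store each context's labelled members."""
    centroids = kmeans(X, init_centroids)
    members = defaultdict(list)
    for p, label in zip(X, y):
        members[nearest(p, centroids)].append((p, label))
    return centroids, members


def predict(p, centroids, members):
    """Route p to its context, then apply that context's strategy."""
    context = members[nearest(p, centroids)]
    if not any(label == 1 for _, label in context):
        return 0  # negative-only (or empty) context: filter out immediately
    # Mixed context: 1-NN restricted to this context's training members.
    return min(context, key=lambda m: dist(p, m[0]))[1]


# Toy data (hypothetical): one all-negative region, one mixed region.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
y = [0, 0, 0, 0, 0, 0, 1, 1]
centroids, members = fit_contexts(X, y, init_centroids=[(0, 0), (10, 10)])

filtered = predict((0.2, 0.3), centroids, members)        # lands in the all-negative context
local_pos = predict((11.0, 10.6), centroids, members)     # mixed context, nearest member is positive
local_neg = predict((10.1, 10.2), centroids, members)     # mixed context, nearest member is negative
```

In this sketch the first query never reaches a classifier, since its context holds only negative samples. The other two are decided by neighbours from their own context alone, which is one simple way to apply a different learning strategy per context.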