Pamukkale University Journal of Engineering Sciences (Oct 2019)
Continuous time threshold selection for binary classification on polarized data
Abstract
Binary classification separates the data elements of interest from the rest according to certain characteristics. Supervised classification techniques typically rely on ground-truth data, which helps determine the distinctive characteristics of the elements to be extracted. These techniques also derive new features for all of the data from the existing features, guided by the ground truth. One purpose of deriving new features is to polarize the data elements (those to be extracted and the others) toward separate poles on a coordinate axis. Binary classification then reduces to choosing a single threshold value on that axis. In this work, Linear Discriminant Analysis (LDA) is used to polarize the data, and a threshold selection algorithm is proposed that uses the F-score (the harmonic mean of precision and recall) of the binary classification outputs produced by specific candidate threshold values. The key condition of the proposed method is that the most suitable threshold must give the best classification score (F-score), and other thresholds must give lower scores the farther they lie from the best threshold (that is, toward the ends of the axis). The proposed method is tested on the binary classification of several meaningful elements in a remote sensing image taken from a 2D semantic labelling dataset that provides ground-truth images. The proposed method converges continuously to the best threshold value in logarithmic time.
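The abstract does not spell out the search procedure, so the following is only a minimal sketch of how a threshold could be located under the stated key condition (the F-score peaks at one threshold and falls off toward both ends of the axis). It uses a plain ternary search as a stand-in, which is not necessarily the authors' algorithm. The names `f_score`, `ternary_search_max`, `projected`, and `truth` are hypothetical, and the four projected values are made-up toy data standing in for LDA outputs.

```python
def f_score(scores, labels, threshold):
    """F-score of the rule 'predict positive when score >= threshold'."""
    tp = fp = fn = 0
    for s, y in zip(scores, labels):
        predicted = s >= threshold
        if predicted and y == 1:
            tp += 1
        elif predicted and y == 0:
            fp += 1
        elif not predicted and y == 1:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


def ternary_search_max(score_fn, lo, hi, tol=1e-6):
    """Locate the maximiser of a single-peaked score_fn on [lo, hi].

    Assumption: score_fn rises to one peak and then falls. Each pass
    removes at least a third of the interval, so the number of score
    evaluations grows with log((hi - lo) / tol).
    """
    while hi - lo > tol:
        third = (hi - lo) / 3
        m1, m2 = lo + third, hi - third
        f1, f2 = score_fn(m1), score_fn(m2)
        if f1 < f2:
            lo = m1          # peak cannot lie left of m1
        elif f1 > f2:
            hi = m2          # peak cannot lie right of m2
        else:
            lo, hi = m1, m2  # tie: keep the middle third
    return (lo + hi) / 2


# Toy 1-D projections (assumed to come from LDA) and ground-truth labels.
projected = [-2.0, -1.0, 1.0, 2.0]
truth = [0, 0, 1, 1]

best = ternary_search_max(
    lambda t: f_score(projected, truth, t), min(projected), max(projected)
)
```

On real data the F-score is a step function of the threshold, so ties on a flat stretch away from the peak can mislead this simple search. The sketch is only reliable to the extent that the single-peak condition described in the abstract holds.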