Revista de Matemática: Teoría y Aplicaciones (Mar 2011)
ROC Curves and Nearest Neighbors: Proposal of a New Condensation Algorithm
Abstract
k-nearest-neighbor (k-NN) rules are nonparametric methods of statistical classification. They are accurate, versatile, and distribution-free. However, their computational cost can be prohibitive, especially for large sample sizes. We present a new condensation algorithm based on the binormal model for ROC curves. It transforms the training sample into a small set of low-dimensional vectors. In contrast to other condensation techniques described in the literature, our proposal lets the user control the trade-off between accuracy and condensation of the training sample. The results of a Monte Carlo study show that its performance can be very competitive in different realistic scenarios, yielding better training samples than other frequently used methods.
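For readers unfamiliar with the binormal model mentioned above, the following minimal sketch shows its standard textbook form, in which the scores of both classes are assumed to be normally distributed. It does not reproduce the condensation algorithm proposed in the paper, which the abstract does not specify. The function names and the parameter names `a` and `b` are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def binormal_params(mu_neg, sd_neg, mu_pos, sd_pos):
    """Map the class score distributions N(mu_neg, sd_neg) and
    N(mu_pos, sd_pos) to the binormal intercept a and slope b."""
    a = (mu_pos - mu_neg) / sd_pos
    b = sd_neg / sd_pos
    return a, b


def binormal_roc(fpr, a, b):
    """True-positive rate at a given false-positive rate:
    ROC(t) = Phi(a + b * Phi^{-1}(t))."""
    if fpr <= 0.0:  # inv_cdf is undefined at 0 and 1
        return 0.0
    if fpr >= 1.0:
        return 1.0
    return _STD.cdf(a + b * _STD.inv_cdf(fpr))


def binormal_auc(a, b):
    """Area under the binormal ROC curve: Phi(a / sqrt(1 + b^2))."""
    return _STD.cdf(a / sqrt(1.0 + b * b))


if __name__ == "__main__":
    # Hypothetical example: unit-variance classes one standard deviation apart.
    a, b = binormal_params(0.0, 1.0, 1.0, 1.0)
    for t in (0.1, 0.5, 0.9):
        print(f"FPR={t:.1f}  TPR={binormal_roc(t, a, b):.3f}")
    print(f"AUC={binormal_auc(a, b):.3f}")
```

Under this model the whole ROC curve is summarized by two numbers, which is one reason it is convenient as a compact description of how well two classes are separated.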