On Subsampling Procedures for Support Vector Machines

Roberto Bárcenas; Maria Gonzalez-Lima; Joaquin Ortega; Adolfo Quiroz

doi:10.3390/math10203776

Mathematics (Oct 2022)

Affiliations

Roberto Bárcenas: Facultad de Ciencias, Universidad Nacional Autónoma de México, Ciudad de Mexico 04510, Mexico
Maria Gonzalez-Lima: Departamento Matemáticas y Estadística, Universidad del Norte, Barranquilla 080001, Colombia
Joaquin Ortega: CEMSE, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
Adolfo Quiroz: Departamento de Matemáticas, Universidad de los Andes, Bogota 111711, Colombia

DOI: https://doi.org/10.3390/math10203776
Journal volume & issue: Vol. 10, no. 20, p. 3776

Abstract

Herein, theoretical results are presented to provide insights into the effectiveness of subsampling methods in reducing the number of instances required in the training stage when applying support vector machines (SVMs) for classification in big data scenarios. Our main theorem states that, under some conditions, there exists, with high probability, a feasible solution to the SVM problem for a randomly chosen training subsample, with the corresponding classifier as close as desired (in terms of classification error) to the classifier obtained from training with the complete dataset. The main theorem also reflects the curse of dimensionality, in that the assumptions made for the results are much more restrictive in high dimensions; thus, subsampling methods will perform better in lower dimensions. Additionally, we propose an importance sampling and bagging subsampling method that expands the nearest-neighbors ideas presented in previous work. On several benchmark examples, the proposed method solves the SVM problem faster (without significant loss in accuracy) than the available state-of-the-art techniques.
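
The idea of comparing a classifier trained on a random subsample with one trained on the complete dataset can be illustrated with a small experiment. The sketch below is not the authors' method and does not implement their importance sampling and bagging procedure; it assumes a synthetic two-class Gaussian dataset in two dimensions and swaps in a linear SVM trained by Pegasos-style stochastic subgradient descent without a bias term. The function names (`make_blobs`, `train_linear_svm`, `accuracy`) and all parameter values are hypothetical choices made for illustration only.

```python
import random

def make_blobs(n, rng):
    """Synthetic data (an assumption): two Gaussian classes centred at (+2, +2) and (-2, -2)."""
    data = []
    for _ in range(n):
        y = rng.choice((-1, 1))
        x = (rng.gauss(2.0 * y, 1.0), rng.gauss(2.0 * y, 1.0))
        data.append((x, y))
    return data

def train_linear_svm(data, rng, lam=0.01, steps=5000):
    """Pegasos-style stochastic subgradient descent on the regularized hinge loss."""
    w = [0.0, 0.0]
    for t in range(1, steps + 1):
        x, y = rng.choice(data)           # draw one training point at random
        eta = 1.0 / (lam * t)             # decreasing step size
        margin = y * (w[0] * x[0] + w[1] * x[1])
        w = [(1.0 - eta * lam) * wi for wi in w]   # shrink (regularization step)
        if margin < 1.0:                  # hinge loss is active for this point
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

def accuracy(w, data):
    """Fraction of points on the correct side of the hyperplane w . x = 0."""
    correct = sum(1 for x, y in data if y * (w[0] * x[0] + w[1] * x[1]) > 0)
    return correct / len(data)

rng = random.Random(0)                    # fixed seed for reproducibility
train = make_blobs(2000, rng)
test = make_blobs(1000, rng)
subsample = rng.sample(train, 200)        # uniform random subsample, 10% of the data

w_full = train_linear_svm(train, rng)
w_sub = train_linear_svm(subsample, rng)
acc_full = accuracy(w_full, test)
acc_sub = accuracy(w_sub, test)
print(f"full: {acc_full:.3f}  subsample: {acc_sub:.3f}")
```

On well-separated, low-dimensional data such as this, the two test accuracies are expected to be close, which is the kind of behavior the abstract describes; the experiment says nothing about training time or about higher-dimensional settings.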

Published in Mathematics

ISSN: 2227-7390 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: http://www.mdpi.com/journal/mathematics
