Vojnotehnički Glasnik (Apr 2023)

Classification and analysis of the MNIST dataset using PCA and SVM algorithms

  • Mokhaled N. A. Al-Hamadani

DOI
https://doi.org/10.5937/vojtehg71-42689
Journal volume & issue
Vol. 71, no. 2
pp. 221 – 238

Abstract

Introduction/purpose: Machine learning methods have become indispensable for analyzing large-scale, complex data, with applications ranging from optimizing business operations to advancing scientific research. Although such voluminous datasets offer potential for insight and innovation, they pose significant challenges in data quality and structure and therefore require effective management strategies. Machine learning techniques have emerged as essential tools for identifying and mitigating these challenges. The MNIST dataset, a large collection of handwritten digits, is a prominent and widely used example in this field and is frequently employed in classification and analysis tasks, as in the present study.

Methods: This study used the MNIST dataset to investigate statistical techniques, including the Principal Component Analysis (PCA) algorithm, implemented in the Python programming language. In addition, Support Vector Machine (SVM) models were applied to both linear and non-linear classification problems to assess model accuracy.

Results: The results indicate that PCA is effective for dimensionality reduction but may be less effective for visualization. Both the linear and the non-linear SVM models classified the dataset effectively.

Conclusion: The findings demonstrate that SVM can serve as an effective technique for classification problems.
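
The workflow summarized in the abstract (PCA for dimensionality reduction, followed by linear and non-linear SVM classification) can be illustrated with a minimal sketch. It is not the author's code. It assumes scikit-learn is available and, to stay small and offline, uses scikit-learn's bundled 8x8 digits dataset as a stand-in for MNIST. The helper name `evaluate_pca_svm`, the choice of 30 components, the RBF kernel as the non-linear model, and the 75/25 split are all illustrative assumptions, not values taken from the paper.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: the small bundled digits dataset (8x8 images) stands in for MNIST (28x28).
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)


def evaluate_pca_svm(kernel, n_components=30):
    """Hypothetical helper: scale, reduce with PCA, fit an SVM, return test accuracy."""
    model = make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components, random_state=0),
        SVC(kernel=kernel),
    )
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


# Compare a linear SVM with a non-linear (RBF-kernel) SVM on the reduced features.
scores = {kernel: evaluate_pca_svm(kernel) for kernel in ("linear", "rbf")}

# Fraction of total variance kept by only two components, i.e. what a 2-D plot retains.
pca_2d = PCA(n_components=2).fit(StandardScaler().fit_transform(X_train))
variance_kept_2d = float(pca_2d.explained_variance_ratio_.sum())

print(scores)
print(f"Variance retained by 2 components: {variance_kept_2d:.2f}")
```

On this stand-in dataset, the share of variance retained by two components is one way to see why a two-dimensional PCA projection can be a weak visualization even when a larger number of components works well as input to a classifier. The numbers printed here say nothing about the accuracies reported in the paper.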

Keywords