Egyptian Informatics Journal (Sep 2024)
iLDA: A new dimensional reduction method for non-Gaussian and small sample size datasets
Abstract
High-dimensional non-Gaussian data is widely found in the real world, such as in face recognition, facial expressions, document recognition, and text processing. Linear discriminant analysis (LDA) as a dimensionality reduction technique performs poorly on non-Gaussian data and fails on high-dimensional data when the number of features exceeds the number of instances, commonly referred to as a small sample size (SSS) problem. We propose a new dimensionality reduction method called iterative LDA (iLDA). The method applies LDA iteratively, gradually extracting features until the best separability is reached. The proposed method produces better vector projections than LDA for both Gaussian and non-Gaussian data and avoids the singularity problem in high-dimensional data. Running LDA repeatedly does not incur excessive computational cost from eigenvector calculation, since the eigenvectors are computed from small-dimensional matrices. The experimental results show performance improvement on 8 out of 10 small-dimensional datasets, with the best improvement on the ULC dataset, from 0.753 to 0.861. For image datasets, accuracy improved across all datasets, with the Chest CT-Scan dataset showing the greatest improvement (from 0.6044 to 0.8384), followed by Georgia Tech (from 0.8883 to 0.9481).
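The abstract does not spell out the iLDA algorithm, but the key ideas it states are (a) applying LDA iteratively and (b) keeping each eigen-decomposition small so the SSS singularity problem never arises. The sketch below is one plausible, illustrative reading of that strategy, not the paper's exact method: features are split into small blocks, LDA extracts discriminant components from each block, and a final LDA pass fuses them. The function name `iterative_lda` and the block-partitioning scheme are assumptions introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def iterative_lda(X, y, block_size=20, n_components=None):
    """Illustrative sketch (NOT the paper's exact algorithm):
    run LDA on small feature blocks so every eigen-decomposition
    involves only a low-dimensional scatter matrix, then fuse the
    extracted components with a final LDA pass."""
    parts = []
    for start in range(0, X.shape[1], block_size):
        block = X[:, start:start + block_size]       # few features per block
        lda = LinearDiscriminantAnalysis()           # at most n_classes-1 components
        parts.append(lda.fit_transform(block, y))
    X_reduced = np.hstack(parts)                     # intermediate low-dim representation
    final = LinearDiscriminantAnalysis(n_components=n_components)
    return final.fit_transform(X_reduced, y)

# Small-sample-size setting: more features (200) than instances (60).
X, y = make_classification(n_samples=60, n_features=200,
                           n_informative=10, n_classes=3,
                           random_state=0)
Z = iterative_lda(X, y, block_size=20)
print(Z.shape)  # (60, 2): at most n_classes - 1 final components
```

Because each block holds far fewer features than there are samples, the within-class scatter matrix in every LDA step stays well-conditioned, which matches the abstract's claim that eigenvectors are always computed from small-dimensional matrices.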