Tehnički Vjesnik (Jan 2024)
Grad-CAM-Based Feature Selection and Dementia Classification Algorithm Using Voice Data
Abstract
This study presents a methodology for dementia classification from voice data that integrates transfer learning, feature selection, and attention-based visualization. We examined two deep learning input techniques: one consolidates three Mel-spectrograms (standard, Harmonic/Percussive average, and delta value) into a single integrated image, and the other assesses them individually. The study validates the efficacy of Mel-spectrogram image classification using Gradient-weighted Class Activation Mapping (Grad-CAM) for feature selection, exploiting the Grad-CAM attention map to pinpoint the most impactful features of the Mel-spectrogram. Evaluations showed that the combined synthetic images yielded 1.4% to 9.4% better accuracy than the separate images. Applying Grad-CAM for feature selection improved accuracy further: models using features identified by Grad-CAM averaged 4.3% higher accuracy than models that were only fine-tuned. With integrated Mel-spectrograms as input, the classification accuracies for Normal vs. Dementia, Normal vs. Mild Cognitive Impairment, and Dementia vs. Mild Cognitive Impairment were 75%, 67.9%, and 67.8%, respectively, an improvement of up to 13.1% over individual images.
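The pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the three Mel-spectrograms are merged or how Grad-CAM attention is turned into a feature selection, so the sketch assumes (a) the three maps become the three channels of one image and (b) features are selected by keeping the pixels in the top fraction of the attention map. The function names `stack_channels`, `grad_cam`, and `select_features`, and the `keep_fraction` parameter, are hypothetical. Only the Grad-CAM map itself follows the standard definition: channel weights are the spatially averaged gradients of the class score, and the map is the ReLU of the weighted sum of activations.

```python
import numpy as np


def stack_channels(mel, hp_avg, delta):
    """Min-max scale three 2-D maps to [0, 1] and stack them as an (H, W, 3) image.

    Assumption: the 'integrated image' uses one spectrogram per channel.
    """
    def scale(x):
        x = np.asarray(x, dtype=np.float64)
        span = x.max() - x.min()
        return np.zeros_like(x) if span == 0 else (x - x.min()) / span

    return np.stack([scale(mel), scale(hp_avg), scale(delta)], axis=-1)


def grad_cam(activations, gradients):
    """Standard Grad-CAM map from conv activations and class-score gradients.

    Both inputs have shape (K, h, w): K feature maps of size h x w.
    """
    weights = gradients.mean(axis=(1, 2))                 # one weight per channel
    cam = np.tensordot(weights, activations, axes=1)      # weighted sum -> (h, w)
    cam = np.maximum(cam, 0.0)                            # ReLU
    peak = cam.max()
    return cam / peak if peak > 0 else cam                # normalize to [0, 1]


def select_features(image, cam, keep_fraction=0.25):
    """Keep only pixels whose attention is in the top `keep_fraction`.

    Assumption: selection is a hard mask; the attention map is upsampled to the
    image size by nearest-neighbour indexing.
    """
    H, W = image.shape[:2]
    h, w = cam.shape
    rows = np.arange(H) * h // H
    cols = np.arange(W) * w // W
    cam_up = cam[rows][:, cols]
    threshold = np.quantile(cam_up, 1.0 - keep_fraction)
    mask = cam_up >= threshold
    return image * mask[..., None], mask


# Usage with random stand-ins for real spectrograms and network tensors.
rng = np.random.default_rng(0)
image = stack_channels(*(rng.random((128, 128)) for _ in range(3)))
attention = grad_cam(rng.random((8, 4, 4)), rng.normal(size=(8, 4, 4)))
masked_image, mask = select_features(image, attention, keep_fraction=0.25)
```

In a real setting the activations and gradients would come from the last convolutional layer of the fine-tuned network, and the masked image (or the selected regions) would be fed back into training. The hard top-fraction mask is only one possible selection rule; a soft weighting by the attention map would be an equally plausible reading of the abstract.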
Keywords