Frontiers in Systems Neuroscience (Nov 2012)
Kernel Principal Component Analysis for dimensionality reduction in fMRI-based diagnosis of ADHD
Abstract
This article explores various preprocessing tools that select or create features to help a learner produce a classifier that can use fMRI data to effectively discriminate Attention-Deficit Hyperactivity Disorder (ADHD) patients from healthy controls. We consider four different learning tasks: predicting either two classes (ADHD vs control) or three classes (ADHD-1 vs ADHD-3 vs control), each using either the imaging data only or both the phenotypic and imaging data. After averaging, BOLD-signal normalization, and masking of the fMRI images, we considered applying the Fast Fourier Transform (FFT), possibly followed by some Principal Component Analysis (PCA) variant (over time: PCA-t; over space and time: PCA-st, or its kernelized variant, kPCA-st), to produce inputs to a learner. The goal was to determine which learned classifier performs best, or at least better than the baseline of 64.2%, which is the proportion of the majority class (here, controls).
In the two-class setting, PCA-t and PCA-st did not perform statistically better than the baseline, whereas FFT and kPCA-st did (FFT, 68.4%; kPCA-st, 70.3%). When combined with the phenotypic data, which by itself produces 72.9% accuracy, all methods performed statistically better than the baseline, but none did better than using the phenotypic data alone. In the three-class setting, neither the PCA variants nor the phenotypic-data classifiers performed statistically better than the baseline.
We next used the FFT output as input to the PCA variants. In the two-class setting, the PCA variants performed statistically better than the baseline using either the FFTed waveforms only (FFT+PCA-t, 69.6%; FFT+PCA-st, 69.3%; FFT+kPCA-st, 68.7%) or the FFTed waveforms combined with the phenotypic data (FFT+PCA-t, 70.6%; FFT+PCA-st, 70.6%; kPCA-st, 76%). In both settings, combining FFT+kPCA-st’s features with the phenotypic data was better than using only the phenotypic data, with the result in the two-class setting being statistically better.
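As a rough illustration of the FFT-then-kernel-PCA idea summarized above, the following minimal sketch runs on synthetic data. It is not the authors' implementation: the array sizes, the RBF kernel, the `gamma` heuristic, the flattening of each subject's spectra into one feature vector, and all function names are assumptions made purely for illustration.

```python
import numpy as np

def fft_features(ts):
    """Magnitude spectrum along the last (time) axis."""
    return np.abs(np.fft.rfft(ts, axis=-1))

def rbf_kernel(X, gamma):
    """Pairwise RBF kernel exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_pca(X, n_components, gamma):
    """Project the training samples onto the top kernel principal components."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    one = np.full((n, n), 1.0 / n)
    # Center the kernel matrix in feature space.
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)               # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]   # keep the largest ones
    vals, vecs = vals[idx], vecs[:, idx]
    return vecs * np.sqrt(np.maximum(vals, 0.0))

def majority_baseline(labels):
    """Accuracy of always predicting the most frequent class."""
    _, counts = np.unique(labels, return_counts=True)
    return counts.max() / counts.sum()

# Synthetic stand-in for preprocessed fMRI time courses (assumed sizes).
rng = np.random.default_rng(0)
n_subjects, n_regions, n_timepoints = 20, 5, 64
scans = rng.standard_normal((n_subjects, n_regions, n_timepoints))
labels = np.array([0] * 13 + [1] * 7)             # 0 = control, 1 = ADHD (made up)

spectra = fft_features(scans)                     # (subjects, regions, frequency bins)
X = spectra.reshape(n_subjects, -1)               # one flattened vector per subject
gamma = 1.0 / (X.shape[1] * X.var())              # assumed scaling heuristic
Z = kernel_pca(X, n_components=3, gamma=gamma)    # low-dimensional learner inputs
baseline = majority_baseline(labels)

print(spectra.shape, Z.shape, baseline)
```

The rows of `Z` would then be passed to a classifier, optionally concatenated with phenotypic features, and the resulting accuracy compared against `baseline`, in the same spirit as the 64.2% majority-class baseline quoted above.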
Keywords