Informatics in Medicine Unlocked (Jan 2023)
Not one size fits all: Influence of EEG type when training a deep neural network for interictal epileptiform discharge detection
Abstract
Objective: Deep learning methods have shown potential in automating interictal epileptiform discharge (IED) detection in electroencephalograms (EEGs). Although these algorithms are known to depend on the type of data used for training, this dependence has not been explicitly explored in EEG analysis applications. We study the difference in performance of deep learning algorithms on routine and ambulatory EEGs.
Methods: For training we used three datasets: i) 166 routine EEGs; ii) 75 ambulatory EEGs; and iii) a combination of the two data types (241 EEGs). Routine EEGs were recorded in the hospital; ambulatory EEGs were recorded in the home environment and included sleep. We trained a deep neural network (VGGC) on each of the three datasets, resulting in three networks: VGGC-R, trained on the routine EEGs; VGGC-A, trained on the ambulatory EEGs; and VGGC-C, trained on all data. All three networks were subsequently tested on a test set composed of 34 routine EEGs and 33 ambulatory recordings. For the evaluation, all 2 s non-overlapping epochs were labeled with a probability that expressed the likelihood of containing an epileptiform discharge. Performance was quantified as sensitivity, specificity and the rate of false positive detections (FPR).
Results: VGGC-R had the best performance on routine EEGs, with 84% sensitivity at 99% specificity. However, the sensitivity of this model was only 53% on ambulatory EEGs, with a specificity of 95% and an FPR >3 FP/min. The networks trained on only ambulatory data or on all data, VGGC-A and VGGC-C, yielded sensitivities of 60% and 79%, respectively, on the ambulatory test data, at 99% specificity and with an FPR <0.4 FP/min. When tested on the routine EEGs, these networks had sensitivities below 60%.
Conclusion: Performance of deep nets for IED detection depends critically on the type of recording used for training. The VGGC-R should be used for routine recordings, the VGGC-C for ambulatory recordings.
Significance: The type of data used to train algorithms should be optimized according to their application, as this has a significant impact on performance.
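Illustration: The epoch-level evaluation described under Methods can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `epoch_metrics`, the 0.5 decision threshold, and the definition of FPR as false positive epochs per minute of recording are all hypothetical choices made for illustration. Only the 2 s non-overlapping epoch length and the three reported metrics come from the abstract.

```python
import math


def epoch_metrics(probs, labels, threshold=0.5, epoch_seconds=2.0):
    """Compute sensitivity, specificity and false positives per minute.

    probs  : per-epoch probabilities of containing an IED (network output)
    labels : per-epoch ground truth, 1 = IED present, 0 = no IED
    threshold     : assumed decision threshold (not given in the abstract)
    epoch_seconds : epoch length; 2 s non-overlapping epochs per the abstract
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")

    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1

    # Epochs do not overlap, so recording time is epochs * epoch length.
    minutes = len(probs) * epoch_seconds / 60.0

    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else math.nan,
        "specificity": tn / (tn + fp) if (tn + fp) else math.nan,
        "fp_per_min": fp / minutes if minutes else math.nan,
    }


# Toy example with made-up values (not data from the study).
toy_probs = [0.9, 0.2, 0.7, 0.1, 0.4, 0.8]
toy_labels = [1, 0, 0, 0, 1, 1]
toy_result = epoch_metrics(toy_probs, toy_labels)
print(toy_result)
```

Sweeping `threshold` over the network output would trade sensitivity against specificity, which is one way an operating point such as 99% specificity could be selected; the abstract does not state how the reported operating points were chosen.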