Genetics in Medicine Open (Jan 2024)
Natural language processing and expert follow-up establishes tachycardia association with CDKL5 deficiency disorder
Abstract
Purpose: CDKL5 deficiency disorder (CDD) is a developmental and epileptic encephalopathy with multisystemic comorbidities. Cardiovascular involvement in CDD has been shown in animal models but remains poorly described in CDD cohorts. Methods: We identified 38 individuals with genetically confirmed CDD through the Cleveland Clinic CDD specialty clinic and matched them to 190 individuals with non-genetic epilepsy as a comparison group. Natural language processing was applied to extract Human Phenotype Ontology (HPO) terms from medical records. We conducted HPO association testing and manual chart review to explore cardiovascular comorbidities associated with CDD. Results: We extracted 243,541 HPO terms from 30,512 medical encounters. Phenome-wide analysis confirmed well-established CDD phenotypes and identified an association of tachycardia with CDD (odds ratio 4.2, 95% confidence interval (CI) 1.75-9.93, Padj < .001). We found a 99.6-fold enrichment of supraventricular tachycardia (SVT) in CDD encounter notes (Padj < .001), which led to the identification of 2 cases of fetal/neonatal-onset SVT, previously undescribed in CDD. Tachycardia in individuals with CDD was associated with the presence of other autonomic symptoms (odds ratio 5.63, 95% CI 1.08-40.3, P = .038). Conclusion: CDD is associated with tachycardia, potentially including early-onset SVT. Alongside prospective validation studies, semiautomated genotype-phenotype analysis with matched controls is a scalable, rapid, and efficient approach for validating known phenotype associations and identifying novel ones.
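As an illustration of the kind of per-term association measure reported above, the following is a minimal sketch of an odds ratio with a Wald-type 95% confidence interval computed from a 2x2 table (individuals with and without a given HPO term, in cases and in controls). The abstract does not state which test or interval method was used, so the Wald interval, the continuity correction, the function name `odds_ratio_wald_ci`, and the example counts are all assumptions for illustration only. The counts are hypothetical and are not data from the study.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald-type confidence interval for a 2x2 table.

    a: cases with the term       b: cases without the term
    c: controls with the term    d: controls without the term
    z: normal quantile (1.96 corresponds to an approximate 95% interval)

    Assumption: a Wald interval on the log odds ratio; the source does not
    say which interval method was used.
    """
    # Haldane-Anscombe correction: add 0.5 to every cell if any cell is zero,
    # so that the ratio and the standard error stay finite.
    if min(a, b, c, d) == 0:
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))

    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts (NOT from the study): 38 cases and 190 controls,
# of whom 12 and 20 respectively carry the term of interest.
or_value, ci_low, ci_high = odds_ratio_wald_ci(12, 26, 20, 170)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

In a phenome-wide setting this calculation would be repeated for every HPO term, and the resulting P values would then be adjusted for multiple testing; the adjustment method behind the reported Padj values is not given in the abstract.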