Human Genome Variation (Jun 2024)

Identifying unstable CNG repeat loci in the human genome: a heuristic approach and implications for neurological disorders

  • Varun Suroliya,
  • Bharathram Uppili,
  • Manish Kumar,
  • Vineet Jha,
  • Achal K. Srivastava,
  • Mohammed Faruq

DOI
https://doi.org/10.1038/s41439-024-00281-0
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 8

Abstract

Tandem nucleotide repeat (TNR) expansions, particularly the CNG nucleotide configuration, are associated with a variety of neurodegenerative disorders. In this study, we aimed to identify novel unstable CNG repeat loci associated with the neurogenetic disorder spinocerebellar ataxia (SCA). Using a computational approach, 15,069 CNG repeat loci in the coding and noncoding regions of the human genome were identified. Based on the feature selection criteria (repeat length >10 and functional location of repeats), we selected 52 repeats for further analysis and evaluated the repeat length variability in 100 control subjects. A subset of 19 CNG loci observed to be highly variable in control subjects was selected for subsequent analysis in 100 individuals with SCA. The genes with these highly variable repeats also exhibited higher gene expression levels in the brain according to the tissue expression dataset (GTEx). No pathogenic expansion events were identified in patient samples, which is a limitation given the size of the patient group examined; however, these loci contain potential risk alleles for expandability. Recent studies have implicated GLS, RAI1, GIPC1, MED15, EP400, MEF2A, and CNKSR2 in neurological diseases, with GLS, GIPC1, MED15, RAI1, and MEF2A sharing the same repeat loci reported in this study. This finding validates the approach of evaluating repeat loci in different populations and their possible implications for human pathologies.
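
The abstract does not describe how the genome-wide scan was implemented, so the following is only an illustrative sketch of what detecting CNG repeat tracts and applying a length cutoff might look like. It assumes that "repeat length >10" means more than 10 consecutive copies of the same trinucleotide unit, and that only uninterrupted tracts on the given strand are of interest; the function name `find_cng_repeats` and the parameter `min_units` are hypothetical and are not taken from the paper.

```python
import re

# A CNG unit is C, any nucleotide, then G (CAG, CCG, CGG, or CTG).
# The backreference \1 requires the SAME unit to repeat in tandem.
CNG_TANDEM = re.compile(r"(C[ACGT]G)\1+")


def find_cng_repeats(sequence, min_units=11):
    """Return (start, unit, n_units) for each uninterrupted CNG tract.

    min_units=11 encodes the assumed reading of "repeat length >10"
    as more than 10 repeat units; adjust if the cutoff means bases.
    """
    hits = []
    for match in CNG_TANDEM.finditer(sequence.upper()):
        n_units = (match.end() - match.start()) // 3
        if n_units >= min_units:
            hits.append((match.start(), match.group(1), n_units))
    return hits


# Toy sequence: a 12-unit CAG tract (kept) and a 5-unit CGG tract (filtered out).
example = "AT" + "CAG" * 12 + "TT" + "CGG" * 5 + "A"
print(find_cng_repeats(example))  # [(2, 'CAG', 12)]
```

A real analysis would also need to scan the reverse strand, tolerate interrupted tracts, and annotate each hit by genomic context (coding or noncoding), none of which this sketch attempts.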