Genes (Jan 2023)
Evolutionary Landscape of SOX Genes to Inform Genotype-to-Phenotype Relationships
- Adam Underwood,
- Daniel T Rasicci,
- David Hinds,
- Jackson T Mitchell,
- Jacob K Zieba,
- Joshua Mills,
- Nicholas E Arnold,
- Taylor W Cook,
- Mehdi Moustaqil,
- Yann Gambin,
- Emma Sierecki,
- Frank Fontaine,
- Sophie Vanderweele,
- Akansha S Das,
- William Cvammen,
- Olivia Sirpilla,
- Xavier Soehnlen,
- Kristen Bricker,
- Maram Alokaili,
- Morgan Green,
- Sadie Heeringa,
- Amy M Wilstermann,
- Thomas M. Freeland,
- Dinah Qutob,
- Amy Milsted,
- Ralf Jauch,
- Timothy J Triche,
- Connie M Krawczyk,
- Caleb P Bupp,
- Surender Rajasekaran,
- Mathias Francois,
- Jeremy W. Prokop
Affiliations
- Adam Underwood
- Division of Mathematics and Science, Walsh University, North Canton, OH 44720, USA
- Daniel T Rasicci
- Division of Mathematics and Science, Walsh University, North Canton, OH 44720, USA
- David Hinds
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
- Jackson T Mitchell
- Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 49503, USA
- Jacob K Zieba
- Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 49503, USA
- Joshua Mills
- Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 49503, USA
- Nicholas E Arnold
- Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 49503, USA
- Taylor W Cook
- Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 49503, USA
- Mehdi Moustaqil
- Single Molecule Science, Lowy Cancer Research Centre, The University of New South Wales, Sydney, NSW 2031, Australia
- Yann Gambin
- Single Molecule Science, Lowy Cancer Research Centre, The University of New South Wales, Sydney, NSW 2031, Australia
- Emma Sierecki
- Single Molecule Science, Lowy Cancer Research Centre, The University of New South Wales, Sydney, NSW 2031, Australia
- Frank Fontaine
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD 4072, Australia
- Sophie Vanderweele
- Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 49503, USA
- Akansha S Das
- Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 49503, USA
- William Cvammen
- Division of Mathematics and Science, Walsh University, North Canton, OH 44720, USA
- Olivia Sirpilla
- Division of Mathematics and Science, Walsh University, North Canton, OH 44720, USA
- Xavier Soehnlen
- Division of Mathematics and Science, Walsh University, North Canton, OH 44720, USA
- Kristen Bricker
- Division of Mathematics and Science, Walsh University, North Canton, OH 44720, USA
- Maram Alokaili
- Division of Mathematics and Science, Walsh University, North Canton, OH 44720, USA
- Morgan Green
- Department of Chemistry, Grand Valley State University, Allendale, MI 49401, USA
- Sadie Heeringa
- Department of Biology, Calvin University, Grand Rapids, MI 49546, USA
- Amy M Wilstermann
- Department of Biology, Calvin University, Grand Rapids, MI 49546, USA
- Thomas M. Freeland
- Division of Mathematics and Science, Walsh University, North Canton, OH 44720, USA
- Dinah Qutob
- Division of Mathematics and Science, Walsh University, North Canton, OH 44720, USA
- Amy Milsted
- Division of Mathematics and Science, Walsh University, North Canton, OH 44720, USA
- Ralf Jauch
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR 518057, China
- Timothy J Triche
- Center for Epigenetics, Van Andel Research Institute, Grand Rapids, MI 49503, USA
- Connie M Krawczyk
- Department of Metabolism and Nutritional Programming, Van Andel Institute, Grand Rapids, MI 49503, USA
- Caleb P Bupp
- Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 49503, USA
- Surender Rajasekaran
- Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 49503, USA
- Mathias Francois
- The Centenary Institute, The University of Sydney, Royal Prince Alfred Hospital, Sydney, NSW 2006, Australia
- Jeremy W. Prokop
- Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 49503, USA
- DOI
- https://doi.org/10.3390/genes14010222
- Journal volume & issue
- Vol. 14, no. 1, p. 222
Abstract
The SOX transcription factor family is pivotal in controlling aspects of development. To identify genotype–phenotype relationships of SOX proteins, we performed a non-biased study of SOX using 1890 open-reading frame and 6667 amino acid sequences in combination with structural dynamics to interpret 3999 gnomAD, 485 ClinVar, 1174 Geno2MP, and 4313 COSMIC human variants. Within the HMG (High Mobility Group)-box, we identified twenty-seven amino acids with changes in multiple SOX proteins annotated to clinical pathologies. These sites were screened through Geno2MP medical phenotypes, revealing novel SOX15 R104G associated with musculature abnormality and SOX8 R159G associated with intellectual disability. Within gnomAD, SOX18 E137K (rs201931544), located within the HMG-box and found in ~0.8% of Latinx individuals, is associated with seizures and neurological complications, potentially through blood–brain barrier alterations. A total of 56 highly conserved variants were found at sites outside the HMG-box, including several within the SOX2 HMG-box-flanking region with neurological associations, several in the SOX9 dimerization region associated with Campomelic Dysplasia, SOX14 K88R (rs199932938), which flanks the HMG-box and is associated with cardiovascular complications within European populations, and SOX7 A379V (rs143587868), which lies within a SOXF-conserved far C-terminal domain, is heterozygous in 0.716% of African individuals, and is associated with eye phenotypes. This SOX data compilation builds a robust genotype-to-phenotype association for a gene family through more extensive ortholog data integration.
Keywords