SNPs, short tandem repeats, and structural variants are responsible for differential gene expression across C57BL/6 and C57BL/10 substrains
Milad Mortazavi,
Yangsu Ren,
Shubham Saini,
Danny Antaki,
Celine L. St. Pierre,
April Williams,
Abhishek Sohni,
Miles F. Wilkinson,
Melissa Gymrek,
Jonathan Sebat,
Abraham A. Palmer
Affiliations
Milad Mortazavi
Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
Yangsu Ren
Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
Shubham Saini
Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
Danny Antaki
Department of Psychiatry, University of California San Diego, La Jolla, CA, USA; Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA, USA
Celine L. St. Pierre
Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
April Williams
Salk Institute for Biological Studies, La Jolla, CA, USA
Abhishek Sohni
Department of Obstetrics, Gynecology and Reproductive Sciences, University of California San Diego, La Jolla, CA, USA
Miles F. Wilkinson
Department of Obstetrics, Gynecology and Reproductive Sciences, University of California San Diego, La Jolla, CA, USA; Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA
Melissa Gymrek
Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA; Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA; Department of Medicine, University of California San Diego, La Jolla, CA, USA
Jonathan Sebat
Department of Psychiatry, University of California San Diego, La Jolla, CA, USA; Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA, USA; Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA; Corresponding author
Abraham A. Palmer
Department of Psychiatry, University of California San Diego, La Jolla, CA, USA; Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA; Corresponding author
Summary: Mouse substrains are an invaluable model for understanding disease. We compared C57BL/6J, the most commonly used inbred mouse strain, with 13 closely related inbred substrains: eight C57BL/6 (B6) and five C57BL/10 (B10). Whole-genome sequencing and RNA-sequencing analysis yielded 352,631 SNPs, 109,096 indels, 150,344 short tandem repeats (STRs), 3,425 structural variants (SVs), and 2,826 differentially expressed genes (DE genes) among these 14 strains; 312,981 SNPs (89%) distinguished the B6 and B10 lineages. These lineage-distinguishing SNPs were clustered into 28 short segments that are likely due to introgressed haplotypes rather than new mutations. Outside these introgressed regions, we identified 53 SVs, protein-truncating SNPs, and frameshifting indels that were associated with DE genes. Our results can be used for both forward and reverse genetic approaches and illustrate how introgression and mutational processes give rise to differences among these widely used inbred substrains.