Genome Sequencing Center, HudsonAlpha Institute for Biotechnology, Huntsville, United States; Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, United States
Genome Sequencing Center, HudsonAlpha Institute for Biotechnology, Huntsville, United States; Department of Crop, Soil, and Environmental Sciences, Auburn University, Auburn, United States
Genome Sequencing Center, HudsonAlpha Institute for Biotechnology, Huntsville, United States; Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, United States
The development of multiple chromosome-scale reference genome sequences in many taxonomic groups has yielded a high-resolution view of the patterns and processes of molecular evolution. Nonetheless, leveraging information across multiple genomes remains a significant challenge in nearly all eukaryotic systems. These challenges range from studying the evolution of chromosome structure, to finding candidate genes for quantitative trait loci, to testing hypotheses about speciation and adaptation. Here, we present GENESPACE, which addresses these challenges by integrating conserved gene order and orthology to define the expected physical position of all genes across multiple genomes. We demonstrate this utility by dissecting presence–absence, copy-number, and structural variation at three levels of biological organization: spanning 300 million years of vertebrate sex chromosome evolution, across the diversity of the Poaceae (grass) plant family, and among 26 maize cultivars. The methods to build and visualize syntenic orthology in the GENESPACE R package offer a significant addition to existing gene family and synteny programs, especially in polyploid, outbred, and other complex genomes.
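To make the idea of an "expected physical position" concrete, the sketch below interpolates where a gene from one genome would be expected to fall in a second genome, using flanking syntenic orthologs as anchors. It is a minimal illustration under stated assumptions, not GENESPACE's implementation: the function name `expected_position`, the `(pos_a, pos_b)` anchor format, and the use of simple linear interpolation within a single collinear block are all hypothetical choices made for this example.

```python
from bisect import bisect_right


def expected_position(query_pos, anchors):
    """Estimate the position in genome B of a gene located at query_pos in genome A.

    anchors: (pos_a, pos_b) pairs for syntenic orthologs in ONE collinear block
             (hypothetical input format; positions may be gene ranks or base pairs).
    Returns a float, or None when query_pos lies outside the anchored block.
    """
    anchors = sorted(anchors)

    # A gene that is itself an anchor maps directly to its ortholog's position.
    exact = dict(anchors)
    if query_pos in exact:
        return float(exact[query_pos])

    # Locate the nearest flanking anchors on either side of the query gene.
    pos_a = [a for a, _ in anchors]
    i = bisect_right(pos_a, query_pos)
    if i == 0 or i == len(anchors):
        return None  # no flanking anchor on one side: position is not defined

    (a0, b0), (a1, b1) = anchors[i - 1], anchors[i]

    # Linear interpolation between the flanking anchors; this also handles
    # inverted blocks, where pos_b decreases as pos_a increases.
    frac = (query_pos - a0) / (a1 - a0)
    return b0 + frac * (b1 - b0)


if __name__ == "__main__":
    block = [(10, 110), (20, 120), (40, 130)]
    print(expected_position(30, block))  # midway between the anchors at 20 and 40
```

Under this scheme, a gene with no ortholog in the second genome still receives an expected location there, which is what allows presence–absence and copy-number variation to be examined at a defined syntenic position rather than by sequence similarity alone.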