Nature Communications (Oct 2024)
The DoGA consortium expression atlas of promoters and genes in 100 canine tissues
Abstract
The dog, Canis lupus familiaris, is an important model for studying human diseases. Unlike the genomes of many model organisms, the dog genome has comparatively poor functional annotation, which hampers gene discovery for development, morphology, disease, and behavior. To fill this gap, we established a comprehensive tissue biobank of dog and wolf samples. The biobank consists of 5485 samples representing 132 tissues from 13 dogs, 12 dog embryos, and 24 wolves. In a subset of 100 tissues from nine dogs and 12 embryos, we characterized gene expression activity for each promoter, including alternative and novel (i.e., previously unannotated) promoter regions, using the 5’-targeting RNA sequencing technology STRT2-seq. We identified over 100,000 promoter region candidates in the recent canine genome assembly, CanFam4, including over 45,000 highly reproducible sites with gene expression and respective tissue enrichment levels. We provide a promoter and gene expression atlas with interactive, open data resources, including a data coordination center and genome browser track hubs. We demonstrate the applicability of Dog Genome Annotation (DoGA) data and resources using multiple examples spanning canine embryonic development, morphology and behavior, and diseases across species.