Journal of Sport and Health Science (Dec 2025)
GEPREP: A comprehensive data atlas of RNA-seq-based gene expression profiles of exercise responses
Abstract
Background: Physical activity can regulate gene expression in multiple tissues and cell types. With the development of next-generation sequencing, a large number of RNA-sequencing (RNA-seq)-based gene expression profiles related to physical activity have been shared in public resources; however, they are poorly curated and underutilized. To address this problem, we developed a data atlas through comprehensive collection, curation, and organization of these data.
Methods: The data atlas, termed gene expression profiles of RNA-seq-based exercise responses (GEPREP), was built on a comprehensive collection of high-quality RNA-seq data on exercise responses. The metadata of each sample were manually curated. Data were uniformly processed, and batch effects were corrected. All information is organized in an easy-to-use website for free search, visualization, and download.
Results: GEPREP now includes 69 RNA-seq datasets of pre- and post-exercise samples, comprising 26 human datasets (1120 samples) and 43 mouse datasets (1006 samples). Specifically, there were 977 (87.2%) human samples of skeletal muscle and 143 (12.8%) human samples of blood. The mouse samples span 9 tissues, with skeletal muscle (359, 35.7%) and brain (280, 27.8%) accounting for the largest fractions. Metadata, including subject, exercise interventions, sampling sites, and post-processing methods, are also provided. The metadata and gene expression profiles are freely accessible at http://www.geprep.org.cn/.
Conclusion: GEPREP is a comprehensive data atlas of RNA-seq-based gene expression profiles in response to exercise. With its reliable annotations and user-friendly interfaces, it has the potential to deepen our understanding of exercise physiology.
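As a quick consistency check on the counts reported in the Results, the following minimal sketch recomputes the stated percentages and totals from the sample numbers given in the abstract. The variable and function names are illustrative assumptions and are not part of GEPREP or its website.

```python
# Counts copied from the abstract; all names here are illustrative only.
datasets = {"human": 26, "mouse": 43}
samples = {"human": 1120, "mouse": 1006}

# Tissue counts reported in the abstract. Only the two largest mouse
# tissues are given there, so the mouse entry is intentionally partial.
human_tissues = {"skeletal muscle": 977, "blood": 143}
mouse_main_tissues = {"skeletal muscle": 359, "brain": 280}


def percent(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100 * part / total, 1)


total_datasets = sum(datasets.values())
total_samples = sum(samples.values())

human_pct = {t: percent(n, samples["human"]) for t, n in human_tissues.items()}
mouse_pct = {t: percent(n, samples["mouse"]) for t, n in mouse_main_tissues.items()}

if __name__ == "__main__":
    print("datasets:", total_datasets)
    print("samples:", total_samples)
    print("human tissue shares (%):", human_pct)
    print("mouse main tissue shares (%):", mouse_pct)
```

The human tissue counts sum to the human sample total, and the recomputed shares agree with the percentages quoted in the Results.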