Reconstructing the ancestral gene pool to uncover the origins and genetic links of Hmong–Mien speakers
Yang Gao,
Xiaoxi Zhang,
Hao Chen,
Yan Lu,
Sen Ma,
Yajun Yang,
Menghan Zhang,
Shuhua Xu
Affiliations
Yang Gao
State Key Laboratory of Genetic Engineering, Human Phenome Institute, Zhangjiang Fudan International Innovation Center, Center for Evolutionary Biology, Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University
Xiaoxi Zhang
School of Life Science and Technology, ShanghaiTech University
Hao Chen
Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences
Yan Lu
State Key Laboratory of Genetic Engineering, Human Phenome Institute, Zhangjiang Fudan International Innovation Center, Center for Evolutionary Biology, Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University
Sen Ma
Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences
Yajun Yang
Ministry of Education Key Laboratory of Contemporary Anthropology, Collaborative Innovation Center for Genetics and Development, Fudan University
Menghan Zhang
Institute of Modern Languages and Linguistics, and Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University
Shuhua Xu
State Key Laboratory of Genetic Engineering, Human Phenome Institute, Zhangjiang Fudan International Innovation Center, Center for Evolutionary Biology, Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University
Abstract
Background
Hmong–Mien (HM) speakers are linguistically related and live primarily in China, but little is known about their ancestral origins or the evolutionary mechanisms shaping their genomic diversity. In particular, the lack of whole-genome sequencing data on the Yao population has prevented a full investigation of the origins and evolutionary history of HM speakers, so their origins remain debated.
Results
Here, we deep-sequenced 80 Yao genomes. Our analysis of these genomes, together with data from 28 East Asian populations and 968 ancient Asian genomes, suggested that there is a strong genetic basis for the formation of the HM language family. We estimated that the most recent common ancestor of HM speakers dates to 5800 years ago, while the genetic divergence between HM and Tai–Kadai speakers was estimated at 8200 years ago. We propose that HM speakers originated in the Yangtze River Basin and spread with agricultural civilization. We identified highly differentiated variants between HM speakers and Han Chinese; in particular, a deafness-related missense variant (rs72474224) in the GJB2 gene occurs at a higher frequency in HM speakers than in other populations.
Conclusions
Our results indicate complex gene flow and medically relevant variants involved in the evolutionary history of HM speakers.