PLoS Biology (Feb 2005)

The Genomes of Oryza sativa: a history of duplications.

  • Jun Yu,
  • Jun Wang,
  • Wei Lin,
  • Songgang Li,
  • Heng Li,
  • Jun Zhou,
  • Peixiang Ni,
  • Wei Dong,
  • Songnian Hu,
  • Changqing Zeng,
  • Jianguo Zhang,
  • Yong Zhang,
  • Ruiqiang Li,
  • Zuyuan Xu,
  • Shengting Li,
  • Xianran Li,
  • Hongkun Zheng,
  • Lijuan Cong,
  • Liang Lin,
  • Jianning Yin,
  • Jianing Geng,
  • Guangyuan Li,
  • Jianping Shi,
  • Juan Liu,
  • Hong Lv,
  • Jun Li,
  • Jing Wang,
  • Yajun Deng,
  • Longhua Ran,
  • Xiaoli Shi,
  • Xiyin Wang,
  • Qingfa Wu,
  • Changfeng Li,
  • Xiaoyu Ren,
  • Jingqiang Wang,
  • Xiaoling Wang,
  • Dawei Li,
  • Dongyuan Liu,
  • Xiaowei Zhang,
  • Zhendong Ji,
  • Wenming Zhao,
  • Yongqiao Sun,
  • Zhenpeng Zhang,
  • Jingyue Bao,
  • Yujun Han,
  • Lingli Dong,
  • Jia Ji,
  • Peng Chen,
  • Shuming Wu,
  • Jinsong Liu,
  • Ying Xiao,
  • Dongbo Bu,
  • Jianlong Tan,
  • Li Yang,
  • Chen Ye,
  • Jingfen Zhang,
  • Jingyi Xu,
  • Yan Zhou,
  • Yingpu Yu,
  • Bing Zhang,
  • Shulin Zhuang,
  • Haibin Wei,
  • Bin Liu,
  • Meng Lei,
  • Hong Yu,
  • Yuanzhe Li,
  • Hao Xu,
  • Shulin Wei,
  • Ximiao He,
  • Lijun Fang,
  • Zengjin Zhang,
  • Yunze Zhang,
  • Xiangang Huang,
  • Zhixi Su,
  • Wei Tong,
  • Jinhong Li,
  • Zongzhong Tong,
  • Shuangli Li,
  • Jia Ye,
  • Lishun Wang,
  • Lin Fang,
  • Tingting Lei,
  • Chen Chen,
  • Huan Chen,
  • Zhao Xu,
  • Haihong Li,
  • Haiyan Huang,
  • Feng Zhang,
  • Huayong Xu,
  • Na Li,
  • Caifeng Zhao,
  • Shuting Li,
  • Lijun Dong,
  • Yanqing Huang,
  • Long Li,
  • Yan Xi,
  • Qiuhui Qi,
  • Wenjie Li,
  • Bo Zhang,
  • Wei Hu,
  • Yanling Zhang,
  • Xiangjun Tian,
  • Yongzhi Jiao,
  • Xiaohu Liang,
  • Jiao Jin,
  • Lei Gao,
  • Weimou Zheng,
  • Bailin Hao,
  • Siqi Liu,
  • Wen Wang,
  • Longping Yuan,
  • Mengliang Cao,
  • Jason McDermott,
  • Ram Samudrala,
  • Jian Wang,
  • Gane Ka-Shu Wong,
  • Huanming Yang

DOI
https://doi.org/10.1371/journal.pbio.0030038
Journal volume & issue
Vol. 3, no. 2
p. e38

Abstract

We report improved whole-genome shotgun sequences for the genomes of indica and japonica rice, both with multimegabase contiguity, or almost 1,000-fold improvement over the drafts of 2002. Tested against a nonredundant collection of 19,079 full-length cDNAs, 97.7% of the genes are aligned, without fragmentation, to the mapped super-scaffolds of one or the other genome. We introduce a gene identification procedure for plants that does not rely on similarity to known genes to remove erroneous predictions resulting from transposable elements. Using the available EST data to adjust for residual errors in the predictions, the estimated gene count is at least 38,000-40,000. Only 2%-3% of the genes are unique to any one subspecies, comparable to the amount of sequence that might still be missing. Despite this lack of variation in gene content, there is enormous variation in the intergenic regions. At least a quarter of the two sequences could not be aligned, and where they could be aligned, single nucleotide polymorphism (SNP) rates varied from as little as 3.0 SNP/kb in the coding regions to 27.6 SNP/kb in the transposable elements. A more inclusive new approach for analyzing duplication history is introduced here. It reveals an ancient whole-genome duplication, a recent segmental duplication on Chromosomes 11 and 12, and massive ongoing individual gene duplications. We find 18 distinct pairs of duplicated segments that cover 65.7% of the genome; 17 of these pairs date back to a common time before the divergence of the grasses. More important, ongoing individual gene duplications provide a never-ending source of raw material for gene genesis and are major contributors to the differences between members of the grass family.