Scientific Data (Aug 2023)

High-quality wild barley genome assemblies and annotation with Nanopore long reads and Hi-C sequencing data

  • Rui Pan,
  • Haifei Hu,
  • Yuhui Xiao,
  • Le Xu,
  • Yanhao Xu,
  • Kai Ouyang,
  • Chengdao Li,
  • Tianhua He,
  • Wenying Zhang

DOI
https://doi.org/10.1038/s41597-023-02434-2
Journal volume & issue
Vol. 10, no. 1
pp. 1 – 10

Abstract

Read online

Abstract Wild barley, from “Evolution Canyon (EC)” in Mount Carmel, Israel, are ideal models for cereal chromosome evolution studies. Here, the wild barley EC_S1 is from the south slope with higher daily temperatures and drought, while EC_N1 is from the north slope with a cooler climate and higher relative humidity, which results in a differentiated selection due to contrasting environments. We assembled a 5.03 Gb genome with contig N50 of 3.53 Mb for wild barley EC_S1 and a 5.05 Gb genome with contig N50 of 3.45 Mb for EC_N1 using 145 Gb and 160.0 Gb Illumina sequencing data, 295.6 Gb and 285.35 Gb Nanopore sequencing data and 555.1 Gb and 514.5 Gb Hi-C sequencing data, respectively. BUSCOs and CEGMA evaluation suggested highly complete assemblies. Using full-length transcriptome data, we predicted 39,179 and 38,373 high-confidence genes in EC_S1 and EC_N1, in which 93.6% and 95.2% were functionally annotated, respectively. We annotated repetitive elements and non-coding RNAs. These two wild barley genome assemblies will provide a rich gene pool for domesticated barley.