Scientific Data (Apr 2024)

Three de novo assembled wild cacao genomes from the Upper Amazon

  • Orestis Nousias,
  • Jinfang Zheng,
  • Tang Li,
  • Lyndel W. Meinhardt,
  • Bryan Bailey,
  • Osman Gutierrez,
  • Indrani K. Baruah,
  • Stephen P. Cohen,
  • Dapeng Zhang,
  • Yanbin Yin

DOI
https://doi.org/10.1038/s41597-024-03215-1
Journal volume & issue
Vol. 11, no. 1
pp. 1–9

Abstract

Theobroma cacao, the chocolate tree, is indigenous to the Amazon basin, the greatest biodiversity hotspot on Earth. Recent advances in plant genomics highlight the importance of de novo sequencing of multiple reference genomes to capture the genome diversity present in different cacao populations. In this study, three high-quality chromosome-level genomes of wild cacao were assembled de novo from HiFi long-read sequencing data and scaffolded using a reference-free strategy. These genomes represent the three most important genetic clusters of cacao trees from the Upper Amazon region. The three wild cacao genomes were compared with two reference genomes of domesticated cacao. The five cacao genetic clusters were inferred to have diverged in the early and middle Pleistocene, approximately 1.83–0.69 million years ago. These results offer an example of how Amazonian biodiversity developed. The three wild cacao genomes provide valuable resources for studying genetic diversity and advancing the genetic improvement of this species.