Frontiers in Genetics (Feb 2022)
The Chromosome-Scale Reference Genome of Macadamia tetraphylla Provides Insights Into Fatty Acid Biosynthesis
Abstract
Macadamia is an evergreen tree belonging to the Proteaceae family. The two commercial macadamia species, Macadamia integrifolia and M. tetraphylla, are highly prized for their edible kernels. The M. integrifolia genome was recently sequenced, but the genome of M. tetraphylla has not yet been published, which limits biological research and breeding in this species. This study reports a high-quality genome sequence of M. tetraphylla based on Oxford Nanopore Technologies sequencing and high-throughput chromosome conformation capture (Hi-C). An assembly of 750.87 Mb with an N50 length of 51.11 Mb was generated, close to the size estimates of 740 Mb from flow cytometry and 758 Mb from k-mer analysis. Genome annotation indicated that repetitive sequences make up 61.42% of the genome and long terminal repeat retrotransposons make up 34.95%. A total of 31,571 protein-coding genes were predicted, of which 92.59% were functionally annotated. The average gene length was 6,055 bp. Comparative genome analysis revealed that gene families associated with defense response, lipid transport, steroid biosynthesis, triglyceride lipase activity, and fatty acid metabolism are expanded in the M. tetraphylla genome. The distribution of fourfold synonymous third-codon transversions (4DTv) indicated a recent whole-genome duplication event in M. tetraphylla. Genomic and transcriptomic analyses identified 187 genes encoding 33 crucial oil biosynthesis enzymes, providing a comprehensive map of macadamia lipid biosynthesis. In addition, the 55 identified WRKY genes were preferentially expressed in roots compared with other tissues. The genome sequence of M. tetraphylla provides insights for breeding new varieties and for the genetic improvement of agronomic traits.
Keywords