Gapless genome assembly of Colletotrichum higginsianum reveals chromosome structure and association of transposable elements with secondary metabolite gene clusters
Jean-Félix Dallery,
Nicolas Lapalu,
Antonios Zampounis,
Sandrine Pigné,
Isabelle Luyten,
Joëlle Amselem,
Alexander H. J. Wittenberg,
Shiguo Zhou,
Marisa V. de Queiroz,
Guillaume P. Robin,
Annie Auger,
Matthieu Hainaut,
Bernard Henrissat,
Ki-Tae Kim,
Yong-Hwan Lee,
Olivier Lespinet,
David C. Schwartz,
Michael R. Thon,
Richard J. O’Connell
Affiliations
Jean-Félix Dallery
UMR BIOGER, INRA, AgroParisTech, Université Paris-Saclay
Nicolas Lapalu
UMR BIOGER, INRA, AgroParisTech, Université Paris-Saclay
Antonios Zampounis
UMR BIOGER, INRA, AgroParisTech, Université Paris-Saclay
Sandrine Pigné
UMR BIOGER, INRA, AgroParisTech, Université Paris-Saclay
Isabelle Luyten
UR1164 URGI, INRA
Joëlle Amselem
UR1164 URGI, INRA
Alexander H. J. Wittenberg
KeyGene N.V.
Shiguo Zhou
Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, University of Wisconsin-Madison
Marisa V. de Queiroz
Laboratório de Genética Molecular de Fungos, Universidade Federal de Viçosa
Guillaume P. Robin
UMR BIOGER, INRA, AgroParisTech, Université Paris-Saclay
Annie Auger
UMR BIOGER, INRA, AgroParisTech, Université Paris-Saclay
Matthieu Hainaut
CNRS UMR 7257, Aix-Marseille University
Bernard Henrissat
CNRS UMR 7257, Aix-Marseille University
Ki-Tae Kim
Department of Agricultural Biotechnology, Center for Fungal Genetic Resources, Seoul National University
Yong-Hwan Lee
Department of Agricultural Biotechnology, Center for Fungal Genetic Resources, Seoul National University
Olivier Lespinet
Laboratoire de Recherche en Informatique, CNRS, Université Paris-Sud
David C. Schwartz
Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, University of Wisconsin-Madison
Michael R. Thon
Instituto Hispano-Luso de Investigaciones Agrarias (CIALE), Department of Microbiology and Genetics, University of Salamanca
Richard J. O’Connell
UMR BIOGER, INRA, AgroParisTech, Université Paris-Saclay
Abstract
Background
The ascomycete fungus Colletotrichum higginsianum causes anthracnose disease of brassica crops and the model plant Arabidopsis thaliana. Previous versions of the genome sequence were highly fragmented, causing errors in the prediction of protein-coding genes and preventing the analysis of repetitive sequences and genome architecture.
Results
Here, we re-sequenced the genome using single-molecule real-time (SMRT) sequencing technology which, in combination with optical map data, provided a gapless assembly of all twelve chromosomes except for the ribosomal DNA repeat cluster on chromosome 7. The more accurate gene annotation made possible by this new assembly revealed a large repertoire of secondary metabolism (SM) key genes (89) and putative biosynthetic pathways (77 SM gene clusters). The two mini-chromosomes differed from the ten core chromosomes in being repeat- and AT-rich and gene-poor, but were significantly enriched with genes encoding putative secreted effector proteins. Transposable elements (TEs) were found to occupy 7% of the genome by length. Certain TE families showed a statistically significant association with effector genes and SM cluster genes and were transcriptionally active at particular stages of fungal development. All 24 subtelomeres were found to contain one of three highly conserved repeat elements which, by providing sites for homologous recombination, were probably instrumental in four segmental duplications.
Conclusion
The gapless genome of C. higginsianum provides access to repeat-rich regions that were previously poorly assembled, notably the mini-chromosomes and subtelomeres, and allowed prediction of the complete SM gene repertoire. It also provides insights into the potential role of TEs in gene and genome evolution and host adaptation in this asexual pathogen.