PLoS Genetics (Apr 2014)

Analysis of the genome and transcriptome of Cryptococcus neoformans var. grubii reveals complex RNA expression and microevolution leading to virulence attenuation.

  • Guilhem Janbon,
  • Kate L Ormerod,
  • Damien Paulet,
  • Edmond J Byrnes,
  • Vikas Yadav,
  • Gautam Chatterjee,
  • Nandita Mullapudi,
  • Chung-Chau Hon,
  • R Blake Billmyre,
  • François Brunel,
  • Yong-Sun Bahn,
  • Weidong Chen,
  • Yuan Chen,
  • Eve W L Chow,
  • Jean-Yves Coppée,
  • Anna Floyd-Averette,
  • Claude Gaillardin,
  • Kimberly J Gerik,
  • Jonathan Goldberg,
  • Sara Gonzalez-Hilarion,
  • Sharvari Gujja,
  • Joyce L Hamlin,
  • Yen-Ping Hsueh,
  • Giuseppe Ianiri,
  • Steven Jones,
  • Chinnappa D Kodira,
  • Lukasz Kozubowski,
  • Woei Lam,
  • Marco Marra,
  • Larry D Mesner,
  • Piotr A Mieczkowski,
  • Frédérique Moyrand,
  • Kirsten Nielsen,
  • Caroline Proux,
  • Tristan Rossignol,
  • Jacqueline E Schein,
  • Sheng Sun,
  • Carolin Wollschlaeger,
  • Ian A Wood,
  • Qiandong Zeng,
  • Cécile Neuvéglise,
  • Carol S Newlon,
  • John R Perfect,
  • Jennifer K Lodge,
  • Alexander Idnurm,
  • Jason E Stajich,
  • James W Kronstad,
  • Kaustuv Sanyal,
  • Joseph Heitman,
  • James A Fraser,
  • Christina A Cuomo,
  • Fred S Dietrich

DOI
https://doi.org/10.1371/journal.pgen.1004261
Journal volume & issue
Vol. 10, no. 4
p. e1004261

Abstract

Cryptococcus neoformans is a pathogenic basidiomycetous yeast responsible for more than 600,000 deaths each year. It occurs as two serotypes (A and D) representing two varieties (grubii and neoformans, respectively). Here, we sequenced the genome and performed an RNA-Seq-based analysis of the C. neoformans var. grubii transcriptome structure. We determined the chromosomal locations, analyzed the sequence and structural features of the centromeres, and identified origins of replication. The genome was annotated based on automated and manual curation. More than 40,000 introns populating more than 99% of the expressed genes were identified. Although most of these introns are located in the coding DNA sequences (CDS), over 2,000 introns in the untranslated regions (UTRs) were also identified. Poly(A)-containing reads were employed to locate the polyadenylation sites of more than 80% of the genes. Examination of the sequences around these sites revealed a new poly(A)-site-associated motif (AUGHAH). In addition, 1,197 miscRNAs were identified. These miscRNAs can be spliced and/or polyadenylated, but do not appear to have obvious coding capacity. Finally, this genome sequence enabled a comparative analysis of strain H99 variants obtained after laboratory passage. The spectrum of mutations identified provides insights into the genetics underlying the microevolution of a laboratory strain, and identifies mutations involved in stress responses, mating efficiency, and virulence.
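
The motif AUGHAH is written in IUPAC nucleotide notation, where H stands for any base except G (A, C, or U). As a minimal sketch of how such a motif could be located in a sequence, the Python below expands the IUPAC string into a regular expression and reports match positions. The helper names (`motif_to_regex`, `find_motif`) and the example sequences are hypothetical illustrations, not the authors' pipeline, and the abstract does not state where the motif sits relative to the poly(A) site.

```python
import re

# IUPAC codes needed for this motif; H means "A, C, or U" (anything but G).
# Only the codes used by AUGHAH are listed (assumption: extend as needed).
IUPAC_RNA = {"A": "A", "C": "C", "G": "G", "U": "U", "H": "[ACU]"}


def motif_to_regex(motif):
    """Translate an IUPAC RNA motif into a regular-expression string."""
    return "".join(IUPAC_RNA[ch] for ch in motif)


def find_motif(seq, motif="AUGHAH"):
    """Return 0-based start positions of every (possibly overlapping) match.

    DNA input is accepted: the sequence is upper-cased and T is read as U.
    """
    rna = seq.upper().replace("T", "U")
    # A lookahead lets re.finditer report overlapping occurrences.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(rna)]


if __name__ == "__main__":
    # Hypothetical 3' UTR fragment, used only to exercise the function.
    print(find_motif("CCAUGUAAGG"))
```

Because H excludes G, a sequence such as AUGGAA does not match, while AUGUAA and AUGCAC do.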