PLoS Computational Biology (Jun 2014)

Quantification of HTLV-1 clonality and TCR diversity.

  • Daniel J Laydon,
  • Anat Melamed,
  • Aaron Sim,
  • Nicolas A Gillet,
  • Kathleen Sim,
  • Sam Darko,
  • J Simon Kroll,
  • Daniel C Douek,
  • David A Price,
  • Charles R M Bangham,
  • Becca Asquith

DOI
https://doi.org/10.1371/journal.pcbi.1003646
Journal volume & issue
Vol. 10, no. 6
p. e1003646

Abstract

Estimation of immunological and microbiological diversity is vital to our understanding of infection and the immune response. For instance, what is the diversity of the T cell repertoire? Such questions are partially addressed by high-throughput sequencing techniques that enable identification of immunological and microbiological "species" in a sample. Estimators of the number of unseen species are needed to estimate population diversity from sample diversity. Here we test five widely used non-parametric estimators, and develop and validate a novel method, DivE, to estimate species richness and distribution. We used three independent datasets: (i) viral populations from subjects infected with human T-lymphotropic virus type 1; (ii) T cell antigen receptor clonotype repertoires; and (iii) microbial data from infant faecal samples. When applied to datasets with rarefaction curves that did not plateau, estimates from existing estimators increased systematically with sample size. In contrast, DivE consistently and accurately estimated diversity for all datasets. We identify conditions that limit the application of DivE. We also show that DivE can be used to accurately estimate the underlying population frequency distribution. We have developed a novel method that is significantly more accurate than commonly used biodiversity estimators in microbiological and immunological populations.
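
The two ideas the abstract relies on, a rarefaction curve and a non-parametric richness estimator, can be illustrated with a minimal Python sketch. It rests on assumptions that are not from the paper: the abstract does not name the five estimators tested, so bias-corrected Chao1 is used here only as a familiar example, and the function names and toy population are hypothetical. It is not an implementation of DivE.

```python
import random
from collections import Counter


def chao1(counts):
    """Bias-corrected Chao1 estimate of species richness.

    counts: iterable of per-species abundances observed in one sample.
    """
    observed = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)  # species seen exactly once
    f2 = sum(1 for c in counts if c == 2)  # species seen exactly twice
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


def rarefaction_curve(labels, sizes, n_rep=50, seed=0):
    """Mean number of distinct species in random subsamples of each size.

    labels: list with one species label per sampled cell or read.
    sizes:  subsample sizes, each no larger than len(labels).
    """
    rng = random.Random(seed)
    curve = []
    for n in sizes:
        total = 0
        for _ in range(n_rep):
            # Subsample without replacement and count distinct species.
            total += len(set(rng.sample(labels, n)))
        curve.append(total / n_rep)
    return curve


if __name__ == "__main__":
    # Hypothetical toy population: 2000 species, frequency proportional to 1/rank.
    rng = random.Random(1)
    species = list(range(2000))
    weights = [1.0 / (rank + 1) for rank in species]
    sample = rng.choices(species, weights=weights, k=5000)

    sizes = [500, 1000, 2500, 5000]
    for n, mean_observed in zip(sizes, rarefaction_curve(sample, sizes)):
        # Apply the estimator to one subsample of each size.
        sub_counts = Counter(rng.sample(sample, n)).values()
        print(n, round(mean_observed, 1), round(chao1(sub_counts), 1))
```

Printing the observed richness and the Chao1 estimate side by side at several subsample sizes is one way to check, on a given dataset, whether an estimate keeps drifting as the sample grows.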