BMC Bioinformatics (May 2019)

Balancing sensitivity and specificity in distinguishing TCR groups by CDR sequence similarity

  • Neerja Thakkar,
  • Chris Bailey-Kellogg

DOI
https://doi.org/10.1186/s12859-019-2864-8
Journal volume & issue
Vol. 20, no. 1
pp. 1 – 14

Abstract

Background: Repertoire sequencing is enabling deep explorations into the cellular immune response, including the characterization of commonalities and differences among T cell receptor (TCR) repertoires from different individuals, pathologies, and antigen specificities. In seeking to understand the generality of patterns observed in different groups of TCRs, it is necessary to balance how well each pattern represents the diversity among TCRs from one group (sensitivity) vs. how many TCRs from other groups it also represents (specificity). The variable complementarity determining regions (CDRs), particularly the third CDRs (CDR3s), interact with major histocompatibility complex (MHC)-presented epitopes from putative antigens, and thus encode the determinants of recognition.

Results: Here we systematically characterize the predictive power that can be obtained from CDR3 sequences, using representative, readily interpretable methods for evaluating CDR sequence similarity and then clustering and classifying sequences based on that similarity. An initial analysis of CDR3s of known structure, clustered by structural similarity, helps calibrate the limits of sequence diversity among CDRs that might have a common mode of interaction with presented epitopes. Subsequent analyses demonstrate that this same range of sequence similarity strikes a favorable specificity/sensitivity balance in distinguishing twins from non-twins based on overall CDR3 repertoires, classifying CDR3 repertoires by antigen specificity, and distinguishing general pathologies.

Conclusion: We conclude that within a fairly broad range of sequence similarity, matching CDR3 sequences are likely to share specificities.
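
The sensitivity/specificity trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the similarity measure, so normalized edit distance is used here as a stand-in, and the function names, the threshold, and the example sequences are all hypothetical. Specificity is computed in the conventional sense, as the fraction of out-of-group sequences that the pattern does not match.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Similarity in [0, 1]; an assumed stand-in, not the paper's measure."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def sensitivity_specificity(pattern, in_group, out_group, threshold):
    """Score one pattern sequence against two groups of CDR3 sequences.

    sensitivity: fraction of in-group sequences matched by the pattern
    specificity: fraction of out-of-group sequences NOT matched
    A sequence is "matched" when its similarity to the pattern is >= threshold.
    """
    matched_in = sum(similarity(pattern, s) >= threshold for s in in_group)
    matched_out = sum(similarity(pattern, s) >= threshold for s in out_group)
    sensitivity = matched_in / len(in_group)
    specificity = 1.0 - matched_out / len(out_group)
    return sensitivity, specificity


if __name__ == "__main__":
    # Made-up sequences for illustration only; not data from the paper.
    pattern = "CASSLGQAYEQYF"
    in_group = ["CASSLGQAYEQYF", "CASSLGQSYEQYF", "CASSPGQAYEQYF"]
    out_group = ["CASSIRSSYEQYF", "CAWSVGDTQYF"]
    for threshold in (1.0, 0.9, 0.7):
        print(threshold,
              sensitivity_specificity(pattern, in_group, out_group, threshold))
```

Lowering the threshold lets the pattern cover more of its own group but also more of the other group, which is the balance the abstract refers to.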

Keywords