Data in Brief (Apr 2021)
Characterization of human T cell receptor repertoire data in eight thymus samples and four related blood samples
Abstract
The T cell receptor (TCR) is a heterodimer consisting of TCRα and TCRβ chains that are generated by somatic recombination of multiple gene segments. The nascent TCR repertoire undergoes thymic selection, in which non-functional and potentially autoreactive receptors are removed. In recent years, the development of high-throughput sequencing technology has allowed large-scale assessment of the TCR repertoire, and multiple analysis tools are now available.

In our recent manuscript, "Human thymic T cell repertoire is imprinted with strong convergence to shared sequences" [1], we show highly overlapping thymic TCR repertoires in unrelated individuals. In the current Data in Brief article, we provide a more detailed characterization of the basic features of these thymic and related peripheral blood TCR repertoires. The thymus samples were collected from eight infants undergoing corrective cardiac surgery, two of whom were monozygotic twins [2]. In parallel with the surgery, a small aliquot of peripheral blood was drawn from four of the donors. Genomic DNA was extracted from mechanically released thymocytes and circulating leukocytes. The TCRα and TCRβ repertoires were sequenced on the immunoSEQ platform (Adaptive Biotechnologies). The resulting repertoire data were analysed using relevant features of the immunoSEQ® 3.0 Analyzer (Adaptive Biotechnologies) and the freely available VDJTools software package for the R programming language [3].

The current data analysis displays the basic features of the sequenced repertoires, including observed TCR diversity, various descriptive TCR diversity measures, and V and J gene usage. In addition, multiple methods for calculating repertoire overlap between two individuals are applied. The raw sequence data provide a large database of reference TCRs in healthy individuals at an early developmental stage. The data can be used to improve existing computational models of TCR repertoire behaviour and to generate new ones.
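To make the notions of diversity measures and pairwise repertoire overlap more concrete, the following minimal Python sketch computes a few commonly used quantities on toy data. It is an illustration only and not the pipeline used for this dataset: the function names, the clonotype labels, and the counts are hypothetical, and the chosen measures (Shannon entropy, clonality, Jaccard index, Morisita-Horn index) are assumed here as typical examples rather than taken from the article.

```python
import math

# A repertoire is modelled as a dict mapping a clonotype label
# (e.g. a CDR3 amino acid sequence) to its read or template count.

def shannon_entropy(counts):
    """Shannon entropy (natural log) of clonotype frequencies."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)

def clonality(counts):
    """1 - normalized Shannon entropy; 0 = perfectly even repertoire.

    By convention (an assumption of this sketch), a repertoire with a
    single clonotype is treated as fully clonal (1.0).
    """
    n = sum(1 for c in counts.values() if c > 0)
    if n < 2:
        return 1.0
    return 1.0 - shannon_entropy(counts) / math.log(n)

def jaccard(a, b):
    """Fraction of unique clonotypes shared between two repertoires."""
    set_a, set_b = set(a), set(b)
    return len(set_a & set_b) / len(set_a | set_b)

def morisita_horn(a, b):
    """Abundance-weighted overlap based on clonotype frequencies."""
    total_a, total_b = sum(a.values()), sum(b.values())
    cross = sum((a[k] / total_a) * (b[k] / total_b) for k in a.keys() & b.keys())
    sq_a = sum((c / total_a) ** 2 for c in a.values())
    sq_b = sum((c / total_b) ** 2 for c in b.values())
    return 2 * cross / (sq_a + sq_b)

# Toy repertoires with made-up clonotype labels and counts.
donor_1 = {"CLONE_A": 6, "CLONE_B": 3, "CLONE_C": 1}
donor_2 = {"CLONE_A": 2, "CLONE_D": 2}

print("Shannon entropy, donor 2:", shannon_entropy(donor_2))
print("Clonality, donor 2:", clonality(donor_2))
print("Jaccard overlap:", jaccard(donor_1, donor_2))
print("Morisita-Horn overlap:", morisita_horn(donor_1, donor_2))
```

The Jaccard index considers only whether a clonotype is present, whereas the Morisita-Horn index weights shared clonotypes by their frequencies, so the two can rank the same pair of samples quite differently. This is one reason why applying several overlap measures side by side is informative.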