For thymic selection and responses to pathogens, T cells interact through their αβ T cell receptor (TCR) with peptide–major histocompatibility complex (MHC) molecules on antigen-presenting cells. How the diverse TCRs interact with a multitude of MHC molecules is unresolved. It is also unclear how humans generate larger TCR repertoires than mice do. We compared the TCR repertoire of CD4 T cells selected from a single mouse or human MHC class II (MHC II) in mice containing the human TCR gene loci. Human MHC II yielded greater thymic output and a more diverse TCR repertoire. The complementarity determining region 3 (CDR3) length adjusted for different inherent V-segment affinities to MHC II. Humans evolved with greater nontemplate-encoded CDR3 diversity than did mice. Our data, which demonstrate human TCR–MHC coevolution after divergence from rodents, explain the greater T cell diversity in humans and suggest a mechanism for ensuring that any V–J gene combination can be selected by a single MHC II.
A key event in αβ T cell–mediated interactions is the binding of the TCR to its ligand in the form of short peptides, which are bound to MHC molecules on the surface of antigen-presenting cells (APCs). To accommodate the vast number of antigens presented by various MHC molecules, T cells must generate a diverse αβ TCR repertoire. T cells achieve that task through recombination of the multiple germline-encoded variable (V), diversity (D), and joining (J) gene segments; nontemplate additions and deletions of nucleotides in the V(D)J junctional region; and random αβ chain pairing (Davis and Bjorkman, 1988). Each T cell expresses a unique TCR. Upon encountering antigens, TCRs also undergo conformational adjustments, a so-called induced-fit binding, to ensure specific recognition of respective peptide-MHCs (pMHCs; Krogsgaard and Davis, 2005).
An old question is how T cells, with such TCR diversity (theoretically ∼10¹⁵ clonotypes) and TCR plasticity, react almost exclusively in an MHC-restricted fashion and can react to almost any MHC molecule, considering the great polymorphism of MHC genes (∼15,000 variants in humans; Robinson et al., 2003). Positive selection during T cell development in the thymus imposes self-MHC restriction on T cells because only αβ T cells that bind to self-pMHC complexes with low affinity receive a survival signal (Davis and Bjorkman, 1988; Jameson et al., 1995). Approximately 15% of thymocytes induce signaling for thymic selection; of these, half are negatively selected, likely because of too great an affinity for self-pMHC and cross-reactivity (Merkenschlager et al., 1997; McDonald et al., 2015). The relatively high proportion of MHC-reactive T cells in the preselection pool (∼5–20%) and the fact that ∼10% of the peripheral T cells are MHC alloreactive indicate an intrinsic affinity of TCRs toward MHC (Blackman et al., 1986; Zerrahn et al., 1997; Suchin et al., 2001; Blattman et al., 2002). Namely, the germline-encoded complementarity determining region (CDR) 1 and CDR2 of the Vα and Vβ segments are evolutionarily conserved to react with MHC molecules, which was termed TCR germline bias (Huseby et al., 2005; Marrack et al., 2008; Garcia et al., 2009).
Compelling evidence for this hypothesis resulted from structural and mutational analysis, showing that single amino acid substitutions in a mouse Vβ CDR2, e.g., Tyr48, Tyr50, and Glu54, decreased positive selection in a TCR transgenic mouse model (Dai et al., 2008; Scott-Browne et al., 2009). Furthermore, some Vβ genes of jawed vertebrates (frog, shark, trout, and lizard), which diverged from mammals ∼400 million years ago, share sequences in the CDR2 region of mouse Vβ8.2 but otherwise exhibit little similarity. T cells with chimeric TCRs, containing such Vβ genes, e.g., derived from frogs, were positively selected in mice (Scott-Browne et al., 2011). Further evidence is mounting from the growing database of TCR–pMHC ternary, crystallographic structures (Rossjohn et al., 2015). With few exceptions (Beringer et al., 2015; Rossjohn et al., 2015), many of the solved TCR–pMHC structures to date have adapted a diagonal docking topology atop the pMHC complex. Namely, the CDR1 and CDR2 domains of TCRα or β chains fix over the α2 and α1 helix of MHC class I (MHC I) or β and α helix of MHC II, whereas the CDR3α and the CDR3β are mainly in contact with the presented peptide, respectively (Rossjohn et al., 2015; Adams et al., 2016).
However, not all V gene segments share conserved residues in CDR1 and CDR2. Therefore, it was suggested that each V segment engages its cognate MHC through a menu of structurally coded recognition motifs that have arisen evolutionarily (Feng et al., 2007; Marrack et al., 2008; Garcia et al., 2009), a comprehensive hypothesis that is, however, difficult to address experimentally. A number of similarly convincing studies, including the demonstration of antibody-like T cells that developed in coreceptor- and MHC-deficient mice and some structural analyses of TCR–pMHC complexes, did not support the TCR germline bias for MHC. Hence, it is not generally accepted that TCR and MHC coevolved (Tynan et al., 2005; Gras et al., 2010; Sethi et al., 2011; Tikhonova et al., 2012; Van Laethem et al., 2013; Beringer et al., 2015), which is not surprising, given the complex and flexible interactions that TCR and MHC can undergo.
Structural and mutational analysis of TCR–pMHC complexes depicts only a few of the billions of different possible combinations. Therefore, we wished to address the problem differently, based on several assumptions. We reasoned that thymic selection is the most sensitive readout to detect subtle differences in affinity between a defined MHC molecule and any given TCR. Even though mouse TCRs can be selected on human MHC (Kievits et al., 1987; Ito et al., 1996) and human TCRs can be selected on mouse MHC (Li et al., 2010), we assumed that mouse and human TCR and MHC gene loci further coevolved after their divergence ∼75 million years ago (Waterston et al., 2002), resulting in changes in thymic selection of a polyclonal repertoire, depending on whether the TCR–pMHC interaction was specific for inter- or intraspecies. Therefore, we employed mice with a polyclonal human αβ TCR repertoire, which were deficient for mouse αβ TCRs and expressed either a single human MHC II (HLA-DRA/HLA-DRB1*0401; HLA-DR4, hereafter) or a single-mouse MHC II gene (I-Ab). TCR deep-sequencing of peripheral CD4 T cells from both mouse lines revealed distinct differences in their repertoire, compatible with coevolution of TCR and MHC.
Human TCR gene loci transgenic mice with mouse or human MHC II gene
ABabDII and ABabDR4 mice, which both contain complete human TCRα and TCRβ gene loci and a single MHC II, mouse I-Ab, or human HLA-DR4, were employed in this study. Both strains are deficient for mouse αβ TCRs. ABabDII mice contain the HLA-A*0201 gene and are deficient for mouse MHC I expression (β2m- and Db-deficient), whereas ABabDR4 mice contain two mouse MHC I genes (Kb and Db). HLA-A*0201 and HLA-DR4 are both chimeric molecules allowing mouse CD8 and CD4 coreceptor binding, respectively. The α1 and β1 regions of I-Ab share 56 and 61% homology with the human HLA-DR4 molecule at the amino acid level. Even though I-Ab and HLA-DR4 are not the closest homologues to each other, the TCR repertoire selected by either molecule can be compared, assuming that different MHC II alleles have similar ability to select a diverse repertoire. The cortical and medullary thymic epithelial cells, as well as thymic DCs, which are critical for positive and negative selection of T cells (Klein et al., 2014), expressed comparable levels of MHC II in the two mouse strains (Fig. S1).
Reduced thymic selection by mouse MHC II compared with human MHC II
ABabDII and ABabDR4 mice contained comparable levels of double-positive, CD4 single-positive, and CD8 single-positive cells, as well as CD3+ thymocytes (Fig. 1, A and B). However, ABabDR4 mice contained more CD5/CD69-positive thymocytes (2.1 × 10⁶ ± 0.5 × 10⁶ cells) than ABabDII mice did (1.3 × 10⁶ ± 0.3 × 10⁶ cells), indicating that more T cells received a positive/negative selection signal (Fig. 1, A and B). In both strains, thymocyte development was less efficient than it was in C57BL/6 mice (Fig. 1).
In the periphery, ABabDR4 mice had more CD4 T cells than ABabDII mice had, although the difference did not reach statistical significance (Fig. 1 C). Conventional CD4 T cell (Tcon; FoxP3−CD4+CD3+) numbers, however, were significantly greater in the periphery in ABabDR4 compared with ABabDII mice (Fig. 1 D). Regulatory T cell (Treg; FoxP3+CD4+CD3+) numbers in both the thymus and spleen in ABabDII mice were comparable to those in ABabDR4 mice, but the frequency of Treg within the CD4 T cells was substantially greater in ABabDII (19.9 ± 4.4%) compared with ABabDR4 mice (10.8 ± 4.0%; Fig. 1 D and Fig. S2, A and B).
Similar numbers and frequencies of Treg cells expressed high levels of CD44, a homeostatic proliferation marker for naive T cells, although a greater frequency of Tcon cells in ABabDII mice was CD44hi (51.7 ± 6.1%) compared with ABabDR4 mice (30.0 ± 15.1%; Fig. 1 D and Fig. S2, A and C). Treg cells and Tcon cells had similar Vβ usages in both ABabDII and ABabDR4 mice, based on staining with 24 human Vβ antibodies (Fig. S2 D).
Because ABabDII and ABabDR4 mice had comparable frequencies of peripheral CD8 T cells, it is unlikely that the different MHC I molecules shaped the development of CD4 T cells (Fig. 1 C). Collectively, these data showed that the development of CD4 T cells with human TCRs differed depending on whether they were selected by mouse or human MHC II.
Diverse but nonrandom V-J usage in both mice and humans
We compared the αβ TCR repertoire of CD4 T cells of ABabDII and ABabDR4 mice by quantitative deep sequencing. The CD8 T cell repertoires were not sequenced because ABabDR4 mice possess two MHC I alleles, whereas ABabDII mice have only one human MHC I. We included similar numbers of naive (CD62L+/CD45RO−) CD4 T cells from three human donors as a case of TCR selection on multiple MHC II alleles. All CD4 T cells were isolated by FACS sort with purities >97%. We wish to point out, however, that certain parameters, such as V(D)J usage frequency or CDR3 length, but not repertoire diversity, can be compared between the mice and humans because humans contain a set of six different MHC II alleles by which the T cells were selected. Genomic DNA from ∼2.5 × 10⁵ purified CD4 T cells from ABabDII and ABabDR4 mice and ∼1.8 × 10⁵ from humans was submitted for sequencing. Between 0.6 × 10⁷ and 1.6 × 10⁷ valid reads were obtained (Table 1).
We analyzed V and J gene usages from both in-frame and out-of-frame TCRs, where the out-of-frame TCRs approximated the preselection pool (Zvyagin et al., 2014). Even though many T cells with functional TCR rearrangement are not positively selected and, thus, are part of the preselection pool (McDonald et al., 2015), the out-of-frame TCRs in T cells selected by their second functional TCR represent an unbiased estimate of V–J usage frequency. Neither Vα nor Vβ gene usage in the preselection repertoire differed between the two mouse groups because ABabDII and ABabDR4 mice shared the same TCR transgene loci and employed the same TCR recombination enzymes for rearrangement (i.e., RAG proteins).
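The split into in-frame and out-of-frame rearrangements described above can be illustrated with a minimal Python sketch. It assumes that a rearrangement counts as in-frame when its CDR3 nucleotide length is a multiple of three and the reading frame contains no stop codon; the function names and example sequences are hypothetical and are not taken from the sequencing pipeline used in this study.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}

def is_in_frame(cdr3_nt: str) -> bool:
    """Return True if a CDR3 nucleotide sequence keeps the reading frame.

    Assumption (not from the paper): in-frame means the length is a
    multiple of 3 and no stop codon occurs when reading from position 0.
    """
    seq = cdr3_nt.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return not any(codon in STOP_CODONS for codon in codons)

def split_by_frame(sequences):
    """Partition sequences into (in_frame, out_of_frame) lists."""
    in_frame = [s for s in sequences if is_in_frame(s)]
    out_of_frame = [s for s in sequences if not is_in_frame(s)]
    return in_frame, out_of_frame
```

Under these assumptions, the out-of-frame list would serve as the proxy for the preselection pool and the in-frame list as the postselection repertoire.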
Most Vα and Vβ genes were found to be rearranged, except for TRBV5-1, TRBV6-1, and TRAV1, which were previously reported to be missing or not expressed in the ABab transgenic mice (Fig. 2, A and B; Li et al., 2010). Humans showed a similar out-of-frame V gene usage. However, some Vα and Vβ genes were either more frequently (e.g., TRAV16, TRAV39, TRBV23-1, TRBV25-1, TRBV27, TRBV28) or less frequently (e.g., some of the most 5′ located Vα genes, TRBV9, TRBV10-1, TRBV6-5, TRBV19) used in the two mouse lines compared with humans. The reason for these differences is not clear but could be related, in some cases, to polymorphisms. For example, in the promoter region of TRAV39, a deletion of five nucleotides (TTTTC; available from GenBank under accession no. NC_000014, positions 22 and 125–22,130) was detected in the mouse samples, compared with the TRAV39 gene in the three human donors. A similar polymorphism (TTTTC deletion) has been observed in the CD4 promoter, which was associated with lower promoter activity (Kristiansen et al., 2004).
Both Vα and Vβ gene usage was nonrandom (P < 0.0001, χ² test: actual frequencies to the random gene frequency usage; Fig. 2). The preference for Vα and Vβ gene usage appeared to be different. Vα genes that were closer to the 5′ region of the gene locus were underrepresented. Vβ genes located in the 5′ and 3′ region were preferentially rearranged similarly in mice and humans (i.e., TRBV12-3/4, TRBV21-1, and TRBV27; Fig. 2 B). Jα and Jβ usage was also nonrandom and similar between the transgenic mice and humans (Fig. 2, C and D), although the four Jα segments located closest to the 5′ end were used more frequently in mice than in humans. Most Vα–Jα and Vβ–Jβ gene combinations were detected in both the in-frame and the out-of-frame repertoire (Fig. 3). In general, the postselection repertoire for both the α and the β chains mirrored the usage pattern of the preselection repertoire similarly in ABabDII and ABabDR4 mice (Figs. 2 and 3). However, we also observed changes in the pre- versus postselection repertoire for some V genes and V–J pairing (see I-Ab and HLA-DR4 have distinct imprint in TCR selection).
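The comparison of observed gene usage with random usage can be sketched as a χ² goodness-of-fit statistic against equal expected usage of every segment. The following Python sketch is illustrative only: it assumes that "random" means a uniform distribution over the gene segments, and the read counts are hypothetical. It returns the statistic and the degrees of freedom; converting these to a P value would require a χ² distribution function from a statistics library.

```python
def chi_square_vs_uniform(counts):
    """Return (statistic, degrees_of_freedom) for observed counts tested
    against equal expected usage of every gene segment."""
    k = len(counts)
    total = sum(counts)
    if k < 2 or total == 0:
        raise ValueError("need at least two categories and one observation")
    expected = total / k  # uniform expectation per segment (assumption)
    statistic = sum((obs - expected) ** 2 / expected for obs in counts)
    return statistic, k - 1

# Hypothetical read counts for three V gene segments.
stat, dof = chi_square_vs_uniform([10, 20, 30])
```

A statistic of zero corresponds to perfectly even usage; larger values indicate stronger deviation from random usage.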
Larger CD4 T cell repertoire in ABabDR4 compared with ABabDII mice
ABabDII mice were able to select a diverse human TCR repertoire with a mean of 7.5 × 10⁴ ± 0.60 × 10⁴ in-frame TCRα and 8.8 × 10⁴ ± 0.97 × 10⁴ TCRβ amino acid clonotypes from the CD4 T cells submitted for sequencing (Fig. 4 A and Table 1). Thus, I-Ab molecules can positively select all human Vα–Jα and Vβ–Jβ combinations (Fig. 3). However, HLA-DR4 selected significantly more functional TCRα and TCRβ clonotypes compared with I-Ab from the same number of T cells (10.0 × 10⁴ ± 0.87 × 10⁴ and 12.6 × 10⁴ ± 1.83 × 10⁴ TCRα and TCRβ clonotypes, respectively; Fig. 4 A and Table 1). Humans had the most TCRα clonotypes (14.8 × 10⁴ ± 4.08 × 10⁴), likely because of the effect of positive selection by multiple MHC II molecules but, surprisingly, similar numbers of TCRβ clonotypes (11.7 × 10⁴ ± 1.89 × 10⁴) compared with ABabDR4 mice.
There were more medium-to-large and fewer rare-to-small TCRα and TCRβ clonotypes in ABabDII mice than in ABabDR4 mice (Fig. 4, B and C). Most TCRα and TCRβ clonotypes were rare to small in the two younger human donors. The third donor, aged 60 yr, had some hyperexpanded TCR clones and was not included in the clone-size comparison. In general, ABabDR4 mice had more homogeneous TCRα and TCRβ distributions than ABabDII mice had (a mean inequality score of 0.65 ± 0.01 vs. 0.70 ± 0.03 for TCRα and 0.57 ± 0.07 vs. 0.66 ± 0.05 for TCRβ; Fig. 4 D). Two of the three human donors had the lowest inequality scores of all three groups.
Because only ∼0.5% of the total CD4 T cell repertoire from each mouse was sequenced, we applied a computational approach to determine the total TCR repertoire in CD4 T cells from the mice and humans. A lower-bound estimation on the TCR repertoire size was calculated with the acquired number of productive TCR sequences and the number of their templates detected in the sequencing samples using the iCHAO1 estimator (Chiu et al., 2014). HLA-DR4 selected significantly more functional TCRα and TCRβ clonotypes (3.7 × 10⁵ ± 0.4 × 10⁵ and 6.9 × 10⁵ ± 0.6 × 10⁵ TCRα and TCRβ clonotypes, respectively) compared with I-Ab (2.4 × 10⁵ ± 0.3 × 10⁵ TCRα and 3.5 × 10⁵ ± 0.2 × 10⁵ TCRβ clonotypes; Fig. 4 E). Humans had the most of both TCRα (7.2 × 10⁵ ± 1.1 × 10⁵) and β clonotypes (16.2 × 10⁵ ± 2.2 × 10⁵), likely because of the effect of positive selection by multiple MHC II molecules.
Endogenous superantigens, such as the mouse mammary tumor virus (MMTV) superantigens, could alter CD4 T cell selection, e.g., deletion of T cells with certain Vβ segments, as observed in HLA-DR4 transgenic mice with a mouse TCR repertoire (Ito et al., 1996). We co-cultured purified CellTrace-labeled CD4 T cells from C57BL/6, ABabDII, ABabDR4, and DR4 mice with purified CD19+ cells from ABabDII, ABabDR4, or DR4 mice for 84 h. Compatible with superantigen recognition, C57BL/6 CD4 T cells responded to stimulation with B cells from DR4 and ABabDR4 mice. In contrast, CD4 T cells from ABabDII and ABabDR4 mice, both expressing the human TCR repertoire, did not proliferate in response to the B cells of any mouse line (Fig. S3). Thus, we assume that human TCRs, unlike mouse TCRs, only weakly (or not at all) interact with MMTV and conclude that endogenous mouse superantigens did not obscure our results.
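To illustrate how a lower bound on repertoire size can be derived from the number of clonotypes seen once or twice, the following Python sketch implements the classic Chao1 estimator. This is a deliberately simpler relative of the iCHAO1 estimator used in this study, which refines the bound with additional frequency counts; the function name and the example counts are hypothetical.

```python
from collections import Counter

def chao1_lower_bound(template_counts):
    """Classic Chao1 lower bound on the number of distinct clonotypes.

    template_counts: one template count per observed clonotype.
    Note: this is NOT iChao1; it only uses singletons and doubletons.
    """
    counts = [c for c in template_counts if c > 0]
    s_obs = len(counts)          # clonotypes actually observed
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]    # clonotypes seen exactly once / twice
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form for samples without doubletons.
    return s_obs + f1 * (f1 - 1) / 2

# Hypothetical sample: three singletons, two doubletons, one larger clone.
estimate = chao1_lower_bound([1, 1, 1, 2, 2, 5])
```

The more clonotypes are seen only once relative to those seen twice, the larger the estimated number of unseen clonotypes.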
I-Ab and HLA-DR4 have distinct imprint in TCR selection
In general, after thymic selection the usage pattern remained similar for both Vα and Vβ to what it was before selection. However, we also observed changes for some V genes. For example, TRBV4-1 was strongly preferred by I-Ab, but not HLA-DR4, molecules (Fig. 2). Interestingly, TRBV4-1 is evolutionarily related to a mouse TCRβ chain (Vβ8.2) in the CDR2 region (Scott-Browne et al., 2011). Similarly, TRAV13-1 and TRBV2 were preferentially selected by I-Ab molecules. Conversely, some V genes were preferentially selected by HLA-DR4 but not I-Ab molecules, i.e., TRBV3-1/2 and TRBV12-3/4.
To analyze the V gene usage pattern more globally, an unsupervised method, principal component analysis (PCA), was performed on the V gene usage profiles of the three groups. PCA separated the human samples on the dominant axis (PC1) but did not do so for the two mouse groups for both in-frame and out-of-frame TCRs, which was also reflected by their close Euclidean distance (Vα: 3.0 and Vβ: 2.3; Fig. 5, A–C). Before selection, the only variable among the three groups may have been the species difference. Only after thymic selection did ABabDII and ABabDR4 mice and humans separate from each other and cluster on the PC2 axis, suggesting that the type and number of MHC II molecules influenced V gene usage, similarly for Vα (Fig. 5 A) and Vβ genes (Fig. 5 B). Indeed, in-frame V gene usage was rather similar within groups (ED score ∼1–2) but significantly different between groups (Fig. 5 C). Based on that observation, we grouped the V genes whose usage frequency changed significantly compared with out-of-frame TCR for the two mouse strains into “overrepresented,” “unchanged,” and “underrepresented” (Fig. 6 A). The same distribution pattern and clustering of the three groups in the post- versus preselection repertoire was observed for Vα–Jα and Vβ–Jβ combinatorial events (Fig. 5, D and E) and, surprisingly, also for Jα and Jβ usage frequency (Fig. S4).
A skewed V–J pairing was observed for functional TCRα and β chains compared with the preselection pool (Fig. 3, A and B). ABabDII and ABabDR4 had almost the same out-of-frame V–J pairing patterns, and the TRAV13-1-TRAJ-54/53 and TRBV28-TRBJ2-3/7 were the most prominently selected in both strains. Despite those similarities, the two MHC II molecules also had their own features, e.g., TRBV12-3/4-TRBJ2-1 was the second most selected in ABabDR4 mice, whereas TRBV2-TRBJ2-7 was the second most selected in ABabDII mice (Fig. 3 B). The distinctive patterns of V–J pairing between the two mouse strains, which are imposed by mouse or human MHC II, could also be seen in the PCA (Fig. 5, D and E) and, by comparison, in their Euclidean distance (Fig. 5 F).
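The Euclidean distance (ED) between two V gene usage profiles can be sketched in a few lines of Python. This is a minimal illustration, assuming that each profile is a mapping from gene name to usage frequency in percent and that a gene absent from one profile has a frequency of zero; the function name and the numbers are hypothetical.

```python
import math

def euclidean_distance(usage_a, usage_b):
    """Euclidean distance between two gene-usage profiles.

    Each profile maps a gene name to its usage frequency (assumed to be
    in percent); a gene missing from one profile is treated as 0.
    """
    genes = set(usage_a) | set(usage_b)
    squared = sum((usage_a.get(g, 0.0) - usage_b.get(g, 0.0)) ** 2 for g in genes)
    return math.sqrt(squared)

# Hypothetical usage frequencies (percent) for two samples.
sample_1 = {"TRBV2": 10.0, "TRBV4-1": 5.0}
sample_2 = {"TRBV2": 7.0, "TRBV4-1": 9.0}
distance = euclidean_distance(sample_1, sample_2)
```

Small distances indicate similar usage profiles, as reported within groups, whereas larger distances indicate divergent profiles, as reported between groups.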
HLA-DR4 selects a longer TCRβ CDR3 compared with I-Ab
The V(D)J junctional region (CDR3) generates most diversity within the TCRs and is the major region for antigen contact and recognition. ABabDII and ABabDR4 mice showed rather similar mean TCRα CDR3 length (∼42 bp) in the preselection as well as the postselection pool (Figs. 6 B and 7 A). After selection, the CDR3 length distribution narrowed, but ABabDR4 mice contained a wider range of CDR3 length than ABabDII did. The TCRα CDR3 length distribution in humans was quite similar to that in the mice, yet slightly broader before or after selection (Fig. 7 A).
The TCRβ CDR3 length distribution also did not differ in the preselection pool between the two mouse strains; however, humans generated, on average, longer CDR3 regions with a broader length distribution (Fig. 7, B and C). After selection, the CDR3 length distribution narrowed as seen for TCRα. The peak of CDR3 length remained the same in ABabDR4 mice postselection (42 bp), approaching that observed in humans, but decreased in ABabDII mice to 39 bp (Fig. 7, B and C). Most TCRβ chains selected by HLA-DR4 molecules had, on average, one amino acid longer CDR3 compared with those selected by I-Ab molecules (Fig. 7, B and C). On average, CDR3 was longer in humans, compared with the two mouse strains in both the pre- and postselection repertoire, which was more apparent for the TCRβ chain, likely because of two recombinatorial events (V–D–J; Fig. 7 C).
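The step from the peak lengths in base pairs to the difference of one amino acid follows from the three-nucleotide codon size, as the short Python sketch below shows. The helper name is hypothetical; the two peak values are those given in the text.

```python
def cdr3_length_in_codons(length_bp: int) -> int:
    """Convert an in-frame CDR3 length from base pairs to amino acids."""
    if length_bp % 3 != 0:
        raise ValueError("in-frame CDR3 length must be a multiple of 3 bp")
    return length_bp // 3

peak_dr4 = cdr3_length_in_codons(42)  # postselection peak in ABabDR4 mice
peak_iab = cdr3_length_in_codons(39)  # postselection peak in ABabDII mice
difference = peak_dr4 - peak_iab      # difference in amino acids
```

The 3-bp shift of the peak therefore corresponds to exactly one codon.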
To determine whether the human MHCII molecule is imprinted to select for longer CDR3, we deep-sequenced the mouse TCRβ repertoire from CD4 T cells isolated from three C57BL/6 mice. Mouse TCRβ selected on the mouse I-Ab molecule showed a rather similar mean CDR3 length to ABabDR4 mice and human postselection (Fig. 7, B and C). Thus, selection of TCRβ chains with shorter CDR3 was a specific feature of ABabDII mice.
CDR3 length is the net result of exonuclease and terminal deoxynucleotidyl transferase (TdT) activity. ABabDII, ABabDR4, and C57BL/6 mice had similar exonuclease and TdT activity, reflected by their almost identical number of bp deletions and insertions in the CDR3 region of TCRα (Fig. 7 D) and β chains (Fig. 7, D and E). Compared with mice, humans revealed, on average, more deletions and insertions and, thus, had more exonuclease and TdT activity. When analyzing the CDR3 length in groups according to their V gene usage frequencies, the underrepresented Vβ genes in ABabDII mice had, on average, the shortest CDR3 length, whereas in ABabDR4 mice, they had, on average, the longest CDR3 length (Fig. 6 B).
Shared TCRα and β clonotypes
The number of shared clonotypes (based on amino acid sequences) after selection within and between the two mouse strains and humans was strikingly greater than if the repertoire is created randomly (Fig. S5, A and B; Robins et al., 2010). More shared clonotypes were detected within, compared with between, groups (Fig. 8), and ABabDR4 shared more clonotypes among each other (Jaccard index: 0.224 ± 0.006 for TCRα and 0.076 ± 0.003 for TCRβ) than ABabDII mice did (0.193 ± 0.006 for TCRα and 0.067 ± 0.002 for TCRβ).
The number of shared clonotypes beyond MHC restriction was surprisingly high, and the number of shared TCRα clonotypes (0.189 ± 0.006) was significantly greater than shared TCRβ (0.041 ± 0.002) between ABabDII and ABabDR4 mice, likely because of one compared with two somatic recombination events. ABabDII and ABabDR4 generated more shared TCRα and β than they shared with humans (ABabDII with humans: TCRα 0.072 ± 0.017 and TCRβ 0.0057 ± 0.001; ABabDR4: TCRα 0.082 ± 0.020 and TCRβ 0.0070 ± 0.001), most likely because of the higher genetic similarity between the two mouse strains. Notably, random recombination would yield virtually no shared clonotypes within 2.5 × 10⁵ CD4 T cells. The shared clonotypes within and among groups cumulated linearly with the increase in total clonotypes (Fig. S5 B).
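The Jaccard index used above to quantify clonotype sharing can be sketched as follows. The Python example assumes the standard definition, i.e., the number of clonotypes found in both samples divided by the number found in at least one of them, applied to sets of CDR3 amino acid sequences. How the values were averaged over pairs of individuals is not specified here, and the sequences shown are hypothetical.

```python
def jaccard_index(clonotypes_a, clonotypes_b):
    """Jaccard index of two clonotype collections (amino acid sequences)."""
    a, b = set(clonotypes_a), set(clonotypes_b)
    union = a | b
    if not union:
        return 0.0  # convention for two empty samples (assumption)
    return len(a & b) / len(union)

# Hypothetical CDR3 amino acid clonotypes from two samples.
mouse_1 = {"CASSL", "CASSP", "CASSQ"}
mouse_2 = {"CASSP", "CASSQ", "CASSR"}
shared = jaccard_index(mouse_1, mouse_2)
```

An index of 0 means no shared clonotypes and an index of 1 means identical repertoires, so the reported values of ∼0.2 for TCRα indicate substantial but far from complete overlap.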
Stronger CD4 T cell responses in ABabDR4 compared with DR4 mice
We investigated whether ABabDR4 mice could more efficiently respond to immunization than did DR4 mice expressing a mouse TCR repertoire selected on HLA-DR4. Therefore, we immunized ABabDR4 and DR4 mice with the two HLA-DR4–presented peptides derived from hemagglutinin (HA307–319) or PTPN11mut. PTPN11mut is a somatic mutation in cancer with a single amino acid substitution but is otherwise identical to the mouse homologue. The percentage of IFN-γ+ CD4 T cells in response to both peptides was significantly higher in ABabDR4 compared with DR4 mice (0.37% ± 0.22% vs. 0.04% ± 0.04% for HA307–319 and 0.15% ± 0.06% vs. 0.03% ± 0.02% for PTPN11mut peptide, respectively; Fig. 9). The data suggest that ABabDR4 have more HA307–319 and PTPN11mut CD4 T cell precursors than DR4 mice.
We felt that the controversial discussion about whether and how the TCR and MHC coevolved reached a dead end with good arguments for either side. This is because, in most cases, single TCR–pMHC interactions were analyzed, either by resolving crystallographic structures or by analyzing mutations in the germline-encoded CDR1 and CDR2 (Marrack et al., 2008; Rossjohn et al., 2015). Although the data provided convincing evidence for TCR–MHC coevolution, there are, as is inherent in the immune system, usually exceptions to that rule, so that it is not always easy to distinguish what is the exception and what is the rule. For example, a TCR germline bias (Garcia et al., 2009) is difficult to reconcile with the observation that 85% of the thymocytes do not receive a selection signal and, therefore, apparently lack a sufficient affinity for self-pMHC and that only 7.5% of the thymocytes are MHC or pMHC cross-reactive (Blackman et al., 1986; Merkenschlager et al., 1997; Zerrahn et al., 1997; Huseby et al., 2005; McDonald et al., 2015). Therefore, we addressed the issue differently, based on the assumption that evolving differences in the inherent affinity between TCR and MHC in mice and humans are subtle and that thymic selection is the most sensitive read-out to detect such differences. We investigated the human αβ TCR repertoire because previous studies focused mainly on mouse TCR–MHC interactions, and we analyzed the polyclonal repertoire to encompass the ability of a single MHC allele to select T cells with any possible, functionally rearranged TCR. By comparing the pre- and postselection repertoire selected on a single mouse or human MHC II allele, we indirectly addressed the inherent germline-encoded affinity for any human Vα (Jα) or Vβ (Jβ) segment, which evolved during the some 70-million-year divergence of rodents and humans. Our conclusions became apparent only through massive parallel TCR deep sequencing.
Both TCRα and TCRβ V–J gene usage in CD4 T cells as well as V–J combinatorial frequencies are highly biased, dramatically limiting the theoretically possible T cell repertoire, which was known for the TCRβ repertoire (Robins et al., 2010; Rubelt et al., 2016). The nonrandom usage is hardwired in the human TCR gene loci. It is imprinted in the postselection repertoire but shaped by the respective selecting MHC II molecule, as shown by the PCA and the higher number of shared clonotypes within than between the two mouse lines. Surprisingly, many TCRβ clonotypes were shared, and even more TCRα clonotypes were shared between humans and human MHC II-expressing mice. ABabDR4 mice shared more TCR clonotypes (11% ± 0.3% TCRα and 1.3% ± 0.1% TCRβ chains) with humans than ABabDII mice did (9.6% ± 0.2% TCRα and 1.1% ± 0.2% TCRβ). The abundance of shared TCRα or TCRβ single chains between different species, independent of the MHC II profile, suggests that αβ chain combinatorial pairing has a larger role for creating diversity than previously thought (Arstila et al., 1999). We could not detect more shared clonotypes between ABabDR4 mice and the only HLA-DR4+ human, which is not surprising because humans bear six MHC II alleles, and which TCR is restricted to which MHC II allele is not known.
Mouse MHC II molecules almost perfectly select a human TCR repertoire, but only almost. Basically, all human TCRα V–J and TCRβ V–D–J gene combinations were detected in the postselection repertoire of ABabDII mice, demonstrating that “structurally coded recognition motifs for MHC” (Marrack et al., 2008; Garcia et al., 2009) have been selected and fixed in most human V genes before mouse–human divergence. However, ABabDII mice had reduced thymic output and a greater clonality. The difference in I-Ab and HLA-DR4 in selecting a human TCR repertoire became clearly visible in the global comparison when ABabDR4 mice generated 30% more of both TCRα and TCRβ unique clonotypes (amino acids) than did ABabDII mice. This provides a strong hint that mouse MHC II molecules, at least I-Ab, select a diverse human αβ TCR repertoire less efficiently than human MHC II molecules do.
The increased TCR repertoire selected by HLA-DR4 compared with I-Ab molecules is likely due to a slightly increased inherent affinity, probably in the CDR1 and CDR2 regions (Marrack et al., 2008), of many human Vα genes for HLA-DR4. The increased TCRβ repertoire in ABabDR4, compared with ABabDII, mice is directly reflected in the CDR3 length. ABabDII and ABabDR4 mice revealed similar CDR3β length distribution in the preselection repertoire. After selection, the mean length of CDR3β selected by human MHC II in ABabDR4 mice and humans was one amino acid longer than that selected by mouse MHC II. The shorter mean CDR3β was not seen in C57BL/6 mice, where the TCR and MHC were species compatible. Thus, the most likely explanation is that species-specific TCRs evolved to have an optimal intrinsic affinity for their own MHC or vice versa. Assuming that the intrinsic affinity is not optimal between many human TCRs and mouse MHC II, the peripheral CD4 T cell repertoire in ABabDII mice had to adopt a shorter CDR3 domain to become positively selected (Gilfillan et al., 1995; Marten et al., 1999; Yassai et al., 2002). Shorter CDR3 domains increase the risk that T cells will be cross-reactive (Gavin and Bevan, 1995; Huseby et al., 2008). In line with that finding, ABabDR4 mice had more peripheral Tcon cells, compared with ABabDII mice and, interestingly, the frequency of Treg within the CD4 T cell population was substantially higher in ABabDII (19.9 ± 4.4%) compared with ABabDR4 mice (10.8 ± 4.0%). Thus, we assume that CD4 T cells selected for short CDR3 of human TCRs by mouse MHC II are at higher risk of ending as Tregs because of cross-reactivity. Collectively, the interspecies incompatibility between TCR and MHC further supports TCR–MHC coevolution after divergence of the two species. However, the TCR repertoires selected by the closest homologues, e.g., I-Ab and HLA-DQ, need to be analyzed.
In ABabDII mice, underrepresented V genes seem to have suboptimal affinity for the I-Ab molecule because their CDR3 were, on average, the shortest. In contrast, the underrepresented V genes selected by HLA-DR4 had, on average, the longest CDR3. Thus, the underrepresented human V genes in ABabDII mice may have retained or gained affinity for HLA-DR4 but lost it for I-Ab. On the other hand, the underrepresented V genes in ABabDR4 mice may have too high an inherent affinity for HLA-DR4, assuming that longer CDR3 decrease the affinity. The picture may change with different MHC II alleles, each of which may have a set of preferred and nonpreferred V genes (Sharon et al., 2016). We hypothesize that differences in the inherent affinity of any V segment for any MHC allele are adjusted by CDR3 length, ensuring that T cells with any V segment can be positively selected by any MHC allele, which is supported by the diverse human TCR repertoire in ABabDII mice. HLA-DR4 is capable of selecting a mouse TCR repertoire but does so less efficiently than I-Ab, because CD4 T cell responses were less efficient in HLA-DR4 transgenic, compared with ABabDR4, mice.
Mice generate, on average, a shorter TCRβ CDR3 region than do humans, which can be seen in the preselection repertoire. Surprisingly, the human recombination machinery evolved to both excise and add more nucleotides in the TCRβ V–D and D–J junctional regions, which is executed by combined exonuclease and TdT activity. Both enzymatic activities might increase TCR diversity. This evolutionary process provides a reasonable explanation for the larger T cell repertoire in humans, which was estimated to be 20-fold higher than that of mice (Arstila et al., 1999; Casrouge et al., 2000; Nikolich-Žugich et al., 2004; Vrisekoop et al., 2014).
In conclusion, our findings suggest that human TCRα and TCRβ V genes acquired an inherent affinity for MHC II before separation of rodents and humans. Afterward, human MHC and TCR gene loci further coevolved to maintain inherent affinity and, to compensate for nonrandom V(D)J usage, to increase T cell diversity through greater nontemplate-encoded CDR3 diversity. Our data also suggest that CDR3 length adjusts for different inherent V segment–MHC affinity and that T cells with shorter CDR3β are at increased risk of becoming Tregs.
Materials and methods
All mouse studies were performed in accordance with institutional, state, and federal (Landesamt für Arbeitsschutz, Gesundheitsschutz und technische Sicherheit, Berlin, Germany) guidelines. C57BL/6 and HLA-DR4 mice were purchased from The Jackson Laboratory and Taconic, respectively. ABabDII transgenic mice have been previously described (Li et al., 2010). ABabDR4 mice were established by crossing ABab transgenic mice (Li et al., 2010) to HLA-DR4 mice (Ito et al., 1996) and selecting mice homozygous for the human TCRα and TCRβ gene loci and the HLA-DR4 transgene, as well as for mouse TCRα, TCRβ, I-Eα, and I-Aβ deficiency. The genotype of the mice was confirmed by PCR. Mice were bred in the Max-Delbrück-Center animal facility under specific pathogen–free conditions and were on a mixed 129SV, C57BL/6, and BALB/c genetic background. Mice aged between 8 and 16 wk were used in this study.
Three healthy human donors, aged 30, 48, and 60 yr at the time of blood collection, volunteered to donate blood with informed consent. Blood collection and processing were performed according to human experimental guidelines under license EA4/046/10 (Ethikkommission). The MHC II profiles were determined by genotyping (Zentrum für Humangenetik und Laboratoriumsdiagnostik, Martinsried, Germany). Details are as follows: donor 1: HLA-DRB1 08:01, 11:12; HLA-DRB3 02; HLA-DQB1 03:01, 04:02; HLA-DPB1 03:01, 04:02; donor 2: HLA-DRB1 04:01, 15:01; HLA-DRB4 01:03; HLA-DRB5 01:01; HLA-DQA1 01:02, 03:01; HLA-DQB1 03:02, 06:02; HLA-DPA1 01:03; HLA-DPB1 04:01; and donor 3: HLA-DRB1 01:01, 13:01; HLA-DRB3 02:02; HLA-DQA1 01:01, 01:03; HLA-DQB1 05:01, 06:03; HLA-DPA1 01:03; HLA-DPB1 04:01.
Fluorochrome-conjugated antibodies specific for mouse CD4 (GK1.5), CD8a (53–6.7), CD5 (53–7.3, isotype: rat IgG2a, κ), IFN-γ (XMG1.2), CD45 (30-F11, isotype: rat IgG2b, κ), CD326 (EpCAM, G8.8, isotype: rat IgG2a, κ), I-Ab (AF6-120.1), CD11c (N418), and FoxP3 (MF-14), and human CD3 (HIT3a), CD8a (HIT8a), CD4 (OKT4), CD45RA (HI100), CD45RO (UCHL1), and CD62L (DREG-56) were obtained from BioLegend. Mouse CD3ε (145-2C11), CD69 (H1.2F3, isotype: Armenian hamster IgG), Ly51 (6C3, isotype: rat IgG2a, κ), and HLA-DR (L243) specific antibodies were purchased from BD. The TCR Vβ repertoire kit (IOTest Beta Mark) was purchased from Beckman Coulter. UEA I lectin was obtained from GeneTex.
Thymus, spleen, and LNs from 1–2-mo-old C57BL/6, ABabDII and ABabDR4 mice were isolated. Cells were obtained by mashing the organs through a 0.45-µm cell strainer. Isolation of thymic DCs and epithelial cells was performed as published (Xing and Hogquist, 2014). In brief, thymic lobes were digested in enzyme solution (RPMI-1640 medium with 0.05% Liberase TH and 100 U/ml of DNase I) at 37°C for 20 min. Single cells were then stained with antibodies specified in the respective figure legends and analyzed by flow cytometry (FACSCanto II; BD). FoxP3 staining was performed with the True-Nuclear transcription factor buffer set from BioLegend.
For mouse CD4 T cells, pooled cells from mouse spleen and LNs were collected. For human naive (CD45RO− and CD62L+) CD4 T cell isolation, ∼50 ml fresh blood from human donors was collected, and PBMCs were isolated by the Ficoll density centrifugation method. The cells were sorted by a FACS sorter (FACSARIA III; BD), and the purity for all samples was >95%.
Assay for detection of MMTV
Mouse CD4 T cells and CD19 cells from spleens of C57BL/6, DR4, ABabDII, and ABabDR4 mice were purified using the mouse CD4+ T cell isolation kit and CD19 MicroBeads (Miltenyi Biotec). Subsequently, the purified CD4 T cells were labeled with CellTrace Violet (Thermo Fisher Scientific). 3 × 10⁵ labeled CD4 T cells were co-cultured with 1.5 × 10⁶ CD19 cells from different mouse strains at a ratio of 1:5 in a 96-well, round-bottom plate for 84 h at 37°C, with 5% CO₂. Proliferation of the CD4 T cells was assessed by flow cytometry.
Genomic DNA isolation and TCR deep sequencing
Genomic DNA was extracted with the QIAGEN blood and tissue kit, quantified with a NanoDrop 1000 (Thermo Fisher Scientific), and stored at −80°C. TCRα and TCRβ deep sequencing and quantification were performed on an immunoSEQ platform (Adaptive Biotechnologies). The technique has a sensitivity of 1 in 200,000 T cells and was optimized to minimize the effect of PCR bias introduced in the first multiplex PCR step (Robins et al., 2009; Carlson et al., 2013). 1.2 µg of genomic DNA, which corresponds to ∼2.5 × 10⁵ mouse and ∼1.8 × 10⁵ human CD4 T cells, was sequenced for each sample. TCR sequences were delineated according to the definition established by the International ImMunoGeneTics Information System collaboration.
ABabDR4 and DR4 mice were immunized twice at an interval of 4 wk. 80 µg hemagglutinin peptide HA(307–319) (KYVKQNTLKLATG) or mutant peptide PTPN11(492–506) (KTIQMVRSQRSMVQ; G503A mutation), mixed with 100 µl IFA, and 50 µg CpG oligonucleotides were injected s.c. on both sides of the tail base of each mouse. 14 d after the second immunization, the draining LNs were isolated, single cells were restimulated in vitro for 12 h with the respective peptides, and the IFN-γ production by CD4 T cells was measured intracellularly using the kit and protocol from BD (Cytofix/Cytoperm kit).
Data analysis and statistics were performed in Excel (Microsoft), R (R Foundation for Statistical Computing), and Prism (GraphPad Software).
Gene segment (V, J and V-J pairing) frequencies
Random distribution frequencies were estimated as the reciprocal of the total number of functional TCR genes (V, J, or V–J pairing).
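The reciprocal rule above can be sketched in a few lines of Python. This is only an illustration: the function name and the segment counts are hypothetical, and the assumption that the number of possible V–J pairings is the product of the V and J counts is ours, not stated in the text.

```python
def random_frequency(n_functional: int) -> float:
    """Expected usage frequency if all functional segments were used equally."""
    if n_functional <= 0:
        raise ValueError("need at least one functional gene segment")
    return 1.0 / n_functional


# Hypothetical segment counts, for illustration only (not taken from the study).
n_v, n_j = 40, 12

expected_v = random_frequency(n_v)         # one value shared by every V gene
expected_j = random_frequency(n_j)         # one value shared by every J gene
# Assumption: every V can pair with every J, so there are n_v * n_j pairings.
expected_vj = random_frequency(n_v * n_j)

print(expected_v, expected_j, expected_vj)
```

Observed frequencies above or below these values would then indicate over- or underrepresented segments relative to random usage.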
The TCRα and β repertoire sizes of ABabDII and ABabDR4 mice and humans were estimated based on the deep-sequenced samples using the iChao1 estimator provided with the immunoSEQ platform (Chiu et al., 2014).
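To give a feel for how richness estimators of this family work, the sketch below implements the classic bias-corrected Chao1 lower bound, not iChao1 itself. iChao1 (Chiu et al., 2014) refines Chao1 with a further term based on clonotypes seen three and four times, and the platform's implementation is not reproduced here. The function name and input format are assumptions.

```python
from collections import Counter


def chao1(clone_counts):
    """Bias-corrected Chao1 lower bound on the number of clonotypes.

    clone_counts: one positive integer per observed clonotype, giving the
    number of times that clonotype was sequenced.
    NOTE: this is the simpler Chao1 estimator, used as a stand-in for iChao1.
    """
    counts = [c for c in clone_counts if c > 0]
    s_obs = len(counts)                # clonotypes actually observed
    freq_of_freq = Counter(counts)
    f1 = freq_of_freq[1]               # clonotypes seen exactly once
    f2 = freq_of_freq[2]               # clonotypes seen exactly twice
    # Many singletons relative to doubletons imply many unseen clonotypes.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Toy sample: six observed clonotypes, three of them singletons.
print(chao1([1, 1, 1, 2, 2, 5]))
```

The estimate always equals or exceeds the observed clonotype count, and it equals the observed count when there is at most one singleton.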
Inequality (Gini index) analysis on the in-frame TCR amino acid clonotypes was based on the Lorenz curve.
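A minimal sketch of a Lorenz-curve-based Gini index is given below, assuming the input is a list of sequence counts per in-frame amino acid clonotype. The function name and the trapezoidal integration are our choices for illustration and may differ from the procedure actually used.

```python
def gini_index(clone_counts):
    """Gini index of clonotype abundances, computed from the Lorenz curve.

    0 means all clonotypes are equally abundant; values approaching 1 mean
    a few clonotypes dominate the repertoire.
    """
    xs = sorted(c for c in clone_counts if c >= 0)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one clonotype with a nonzero count")

    # Lorenz curve: cumulative share of sequences held by the k rarest
    # clonotypes, plotted against k / n. Integrate it with trapezoids.
    area = 0.0
    previous_share = 0.0
    cumulative = 0
    for x in xs:
        cumulative += x
        share = cumulative / total
        area += (previous_share + share) / (2 * n)
        previous_share = share

    # Gini = 1 - 2 * (area under the Lorenz curve).
    return 1.0 - 2.0 * area


print(gini_index([5, 5, 5, 5]))    # perfectly even repertoire
print(gini_index([0, 0, 0, 10]))   # one clonotype holds every sequence
```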
PCA was performed based on V, J, or V–J combinatorial frequencies from ABabDII and ABabDR4 mice and humans with the “prcomp” function in R software without data normalization (centralizing data). TRAV1, TRBV5-1, and TRBV6-1 were excluded from the analysis because they were known to be missing in the transgenic mouse repertoire.
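The Euclidean distance (ED score) was calculated, as shown in Eq. 1, to evaluate the similarities of V gene or V–J pairing usage frequencies within and between different groups:
\[ \mathrm{ED} = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \tag{1} \]
where x_i and y_i are the usage frequencies of the i-th V gene or V–J pairing in the two samples being compared, and n is the number of V genes or V–J pairings.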
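The ED score can be sketched as follows, assuming each sample is represented as a mapping from gene (or V–J pair) name to usage frequency. The function name and the frequencies are invented for illustration, and treating a gene absent from one sample as frequency 0 is our assumption.

```python
import math


def euclidean_distance(freq_a, freq_b):
    """Euclidean distance between two usage-frequency profiles.

    freq_a, freq_b: dicts mapping a gene (or V-J pair) name to its frequency.
    Assumption: a gene missing from one sample has frequency 0 there.
    """
    genes = set(freq_a) | set(freq_b)
    squared = sum(
        (freq_a.get(g, 0.0) - freq_b.get(g, 0.0)) ** 2 for g in genes
    )
    return math.sqrt(squared)


# Invented frequencies for two samples, for illustration only.
sample_1 = {"TRBV2": 0.5, "TRBV3-1": 0.5}
sample_2 = {"TRBV2": 0.2, "TRBV3-1": 0.8}
print(euclidean_distance(sample_1, sample_2))
```

A smaller distance means more similar gene usage; identical profiles give a distance of 0.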
Estimation of Gaussian CDR3 length distribution
All CDR3 lengths from ABabDII and ABabDR4 mice and humans were assumed to have a Gaussian distribution (R² > 0.99). We used the variance (square of the SD) of the predicted Gaussian curve to depict the width of the CDR3 distributions.
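As a rough illustration of describing a CDR3 length distribution by a Gaussian, the sketch below estimates the mean and variance by moment matching and then reports R² between the observed frequencies and the Gaussian density. This is a simplification: the fitting procedure actually used (for example, least-squares curve fitting) is not specified in the text, and the function names and toy data are hypothetical.

```python
import math


def gaussian_moments(length_counts):
    """Mean and variance of a CDR3 length distribution (moment matching).

    length_counts: dict mapping CDR3 length to its count or frequency.
    """
    total = sum(length_counts.values())
    if total <= 0:
        raise ValueError("empty length distribution")
    probs = {length: c / total for length, c in length_counts.items()}
    mean = sum(length * p for length, p in probs.items())
    variance = sum((length - mean) ** 2 * p for length, p in probs.items())
    return mean, variance


def r_squared(length_counts, mean, variance):
    """R^2 between observed length frequencies and the Gaussian density."""
    if variance <= 0:
        raise ValueError("variance must be positive")
    total = sum(length_counts.values())
    lengths = sorted(length_counts)
    observed = [length_counts[length] / total for length in lengths]
    predicted = [
        math.exp(-((length - mean) ** 2) / (2 * variance))
        / math.sqrt(2 * math.pi * variance)
        for length in lengths
    ]
    obs_mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed frequencies are all identical")
    return 1.0 - ss_res / ss_tot


# Toy distribution, for illustration only.
toy = {12: 25, 13: 50, 14: 25}
mean, variance = gaussian_moments(toy)
print(mean, variance, r_squared(toy, mean, variance))
```

The variance returned here plays the role of the distribution width described above; a broader length distribution gives a larger variance.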
Absolute number of shared clonotypes
The number of shared clonotypes from any two of the reshaped samples was calculated using the “intersect” function in R software, package tcR (Nazarov et al., 2015).
Jaccard index for similarity analysis
The αβ TCR similarities between any two samples were evaluated with the Jaccard index, which divides the number of shared TCR clonotypes by the number of total clonotypes from the two samples, as shown in Eq. 2:
\[ J(A, B) = \frac{|A \cap B|}{|A \cup B|} \tag{2} \]
where A and B represent TCRα or β repertoires from any two samples.
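The shared-clonotype count and the Jaccard index can both be sketched with Python sets, assuming each repertoire is reduced to its set of distinct clonotype sequences. The function names and the example sequences are hypothetical, and interpreting the "total clonotypes from the two samples" as the set union is our reading of the text.

```python
def shared_clonotypes(repertoire_a, repertoire_b):
    """Number of distinct clonotypes present in both repertoires."""
    return len(set(repertoire_a) & set(repertoire_b))


def jaccard_index(repertoire_a, repertoire_b):
    """Shared clonotypes divided by all distinct clonotypes in either sample."""
    a, b = set(repertoire_a), set(repertoire_b)
    union = a | b
    if not union:
        raise ValueError("both repertoires are empty")
    return len(a & b) / len(union)


# Made-up CDR3 amino acid strings, for illustration only.
sample_a = ["CASSAAAF", "CASSBBBF", "CASSCCCF"]
sample_b = ["CASSBBBF", "CASSCCCF", "CASSDDDF"]
print(shared_clonotypes(sample_a, sample_b))
print(jaccard_index(sample_a, sample_b))
```

Because the index is normalized by the union, it ranges from 0 (no shared clonotypes) to 1 (identical clonotype sets), which makes samples of different sizes comparable.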
No specified effect size was used to determine sample sizes.
TCR sequencing data underlying this study can be analyzed and downloaded from the Adaptive Biotechnologies immuneACCESS site at https://doi.org/10.21417/B7ZD0D.
Online supplemental material
Fig. S1 includes MHC II staining of thymic APCs from C57BL/6, ABabDII, and ABabDR4 mice. Fig. S2 includes additional data related to Fig. 1, showing FoxP3 and CD44 frequencies in C57BL/6, ABabDII, and ABabDR4 mice, including one representative staining and total columns for Treg and Tcon Vβ usages in ABabDII and ABabDR4 mice. Fig. S3 shows CD4 T cell responses to MMTV superantigen-presenting CD19 cells from C57BL/6, DR4, ABabDII, and ABabDR4 mice. Fig. S4 includes additional data related to Fig. 5, showing PCAs of TRAJ/BJ usages in ABabDII and ABabDR4 mice and human donors. Fig. S5 includes additional data related to Fig. 8, showing the absolute TCRα/β clonotypes shared either in total or from the most- to the least-abundant clonotypes among ABabDII and ABabDR4 mice and human donors.
We thank M. Obenaus for suggestions in data interpretation, bioinformatic support, and critical reading of this manuscript; M. Manzke and C. Westen for genotyping the mice; and A. Dhamodaran for critical reading of the manuscript.
This work was supported by the Deutsche Forschungsgemeinschaft through SFB-TR36.
The authors declare no competing financial interests.
Author contributions: X. Chen developed the concept, performed experiments, analyzed the data, and wrote the manuscript; L. Poncette performed the immunization experiments, analyzed the data, and revised the manuscript; and T. Blankenstein developed the concept, analyzed the data, and wrote the manuscript.
Abbreviations used:
MMTV, mouse mammary tumor virus
PCA, principal component analysis
Tcon, conventional CD4 T cell
TdT, terminal deoxynucleotidyl transferase
Treg, regulatory T cell