INTRODUCTION
Sexual transmission of human immunodeficiency virus type 1 (HIV-1) involves a genetic bottleneck (reviewed in references 1 and 2) that typically results in the acquisition of a single transmitted/founder virus (3–8). The low transmission rates of HIV-1 are most consistent with infection by a single variant rather than transmission of a larger population with differential outgrowth. Understanding the biological determinants of the transmission bottleneck is central to developing appropriate strategies to block transmission either by vaccination or other interventions. Thus, there continues to be an interest in characterizing the nature of the transmitted virus as one source of information about the selective pressures at work during transmission.
There are several reasons why there could be phenotypic differences between the transmitted virus and viruses typically found in the blood of the donor. Such differences could arise if the virus population in the donor's genital tract is a distinct subset of the virus in the donor (i.e., compartmentalization). In addition, phenotypic differences may arise due to selection at any of several steps in the transmission process. First, selection may favor viruses that are better able to cross the mucosal surface to reach an initial target cell. Second, selection at the site of transmission may favor viruses capable of infecting available target cells, such as T cells or macrophages, or interacting with dendritic cells (DCs). Third, selection may favor viruses capable of rapid replication in the initial target cell. Fourth, selection may favor viruses whose phenotypes direct them to specific sites, such as the gut-associated lymphoid tissue, where the amount of virus is greatly amplified early after infection. One or more of these features of transmission may impact the biological properties of the virus that initiates infection.
The nature of the transmitted virus has been examined in several settings. Initially, these variants were thought to be macrophage-tropic (9, 10). While the transmitted virus is predominantly an R5 virus (8, 10–15), more recent evidence indicates that macrophage tropism is not required for transmission, and in several studies, there were no examples of transmission of a virus that could efficiently replicate in macrophages (12, 16–18) or infect cells with low levels of CD4 (a surrogate for macrophage tropism) (11, 13).
Other features of the transmitted variant have also been examined. Several studies have consistently observed reduced length of the variable regions in the Env protein and/or reduced N-linked glycosylation density of the Env protein, features reported for subtypes C, A, and D HIV-1 (4, 19, 20) but less clear for subtype B HIV-1 (19, 21, 22). An analysis of a large data set of subtype B env sequences revealed, in the transmitted viruses, selection for a basic amino acid at position 12 in the Env leader sequence that increases Env density on virions and underrepresentation of a glycosylation site at codon 413 (23, 24), although this site is typically not present in subtype C HIV-1. It has been reported that the reduced glycosylation of the Env protein of transmitted viruses enhances binding to α4β7 integrin associated with CD4+ T cells found in gut-associated lymphoid tissue and impacts Env conformation and the interaction with CD4 (25), although this relationship was not detected in a larger sampling of transmitted viruses (13). Other reported features of the transmitted virus are increased neutralization sensitivity to autologous donor antibodies but not heterologous antibodies for subtype C HIV-1 (4) and increased sensitivity to antibodies that bind to the CD4 binding site, suggesting an altered Env conformation, for subtype B HIV-1 (15), although this was not seen for subtype C HIV-1 (13).
N-linked glycosylation plays an important role in the biology of the viral Env protein (reviewed in reference 26). There are approximately 30 glycosylation sites encoded in the extracellular domain of the Env protein, with roughly two-thirds encoded in the relatively conserved domains of Env and one-third in the highly variable regions (27, 28). These sites are present at high frequency, such that carbohydrate accounts for 50% of the protein weight (29). After processing, the added glycan is largely left in a high-mannose configuration (30). An initial mutational analysis of encoded N-linked glycosylation sites showed that they were largely not essential for viral replication, leading to the suggestion that their primary role was immune evasion (31). Subsequent studies have examined the role of glycosylation in neutralization sensitivity and the evolution of neutralization resistance (32–50), supporting the hypothesis that the carbohydrate side chains function as a glycan shield protecting the surface of the Env protein from host antibodies (47). However, there is great variability in the number of encoded glycosylation sites in the env gene, pointing to a dynamic system where sites are being selected for or against to create the observed diverse viral population.
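To make the notion of an "encoded glycosylation site" concrete, the following is a minimal sketch of how potential N-linked glycosylation sites might be counted in a translated Env sequence. It assumes the standard sequon definition (Asn-X-Ser/Thr, where X is any residue except Pro), which is not stated in the text above; the function name and the toy sequences are hypothetical and are not the pipeline used in this study.

```python
import re

# Assumption: a potential N-linked glycosylation site is the sequon N-X-[S/T],
# where X is any residue except proline. The lookahead lets overlapping
# sequons (e.g., "NNTS") each be counted.
SEQUON = re.compile(r"N(?=[^P][ST])")


def count_pngs(protein: str) -> int:
    """Count potential N-linked glycosylation sites in a protein sequence.

    Alignment gap characters ("-") are removed first so that a sequon split
    by a gap in an alignment is still recognized.
    """
    ungapped = protein.upper().replace("-", "")
    return len(SEQUON.findall(ungapped))


if __name__ == "__main__":
    # Toy peptides for illustration only; these are not Env sequences.
    for peptide in ("ANVTQNPSNAS", "NNTS", "N-GT"):
        print(peptide, count_pngs(peptide))
```

Counting encoded sequons in this way measures only the potential for glycosylation; it does not establish that a given site is occupied by a glycan on the mature protein.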
Because of the extreme heterogeneity of the HIV-1 Env protein, it is important that concepts concerning HIV-1 transmission be formulated based on large sample sizes. Here, we compared the sequences of a large number of viral Env proteins from acute/early infections (n = 68) to Env proteins present in contemporaneous chronic infections (n = 62) in the setting of heterosexual transmission of subtype C HIV-1. We found that the Env protein of the transmitted virus was 5% underglycosylated on average compared to Env proteins in chronically infected subjects, with the virus found in acutely infected men being 7% underglycosylated relative to the virus found in chronically infected women. The difference between acutely infected women and chronically infected men was much less pronounced, suggesting that underglycosylation is principally a feature of female-to-male transmission and not a general feature of all transmission types. A subset of the sequences analyzed were cloned into an expression vector to assess the phenotypic characteristics of the Env protein in pseudotyped virus assays. The transmitted viruses were not differentially sensitive to heterologous neutralizing antibodies (with one exception), consistent with the transmitted virus having a conformation similar to that of the virus in chronically infected subjects with respect to antibody sensitivity. In addition, we found, using a more quantitative assay for CD4 dependence in entry, that the transmitted viruses required high levels of CD4 to infect cells, consistent with an activated CD4+ T cell and not a macrophage being the initial target cell for replication after transmission. Both types of viruses could use a form of CCR5 that was inhibited by maraviroc, a CCR5 antagonist. However, only a fraction of the transmitted viruses were able to utilize an alternative conformation of CCR5 that was insensitive to maraviroc, while a majority of viruses from chronic infections were able to use this conformation. 
Collectively, these differences suggest selective pressures at work during transmission. In addition, the underglycosylation of the transmitted/founder virus highlights the fact that the presence of many of the glycosylation sites is variable. We propose that these glycosylation sites can be grouped by their proximity to specific surface structures of the Env protein and that the absence of glycosylation sites may make the virus vulnerable to antibodies targeting these protein surface structures.
(These results were presented in part at the March 2012 CROI Meeting.)
DISCUSSION
Understanding the selective pressures acting on the Env protein during transmission informs our knowledge of the mechanism of transmission and is essential to the development of prevention strategies. Several previous studies have observed that the transmitted virus is underglycosylated in an HIV-1 subtype C cohort (4) and in an HIV-1 subtype A cohort (19), although this effect was less apparent overall in a subtype B cohort (19, 21, 22), with the exception of underrepresentation of a glycosylation site at position 413 (24). We have been able to confirm this observation of reduced glycosylation of the transmitted virus in this large subtype C cohort (Fig. 3A). Consistent with the underglycosylation of the Env protein in the transmitted virus, the virus accumulates glycosylation sites during the course of infection (70, 78, 79). Thus, there appears to be a cycling of glycosylation sites, with a reduction at the time of transmission and an accumulation of sites over time.
Based on our study and others (4, 19), it is clear that viruses with fewer glycosylation sites can have a transmission advantage in certain settings, but the magnitude and mechanism of this advantage are unknown. We have estimated what the magnitude of this advantage would have to be in order to transform the distribution of glycosylation counts among viruses from chronically infected subjects into the distribution seen in the transmitted viruses. We estimate that an increase in the probability of transmission of 20% for each lost glycosylation site could account for the differences in these distributions (Fig. 3A). A persistent question has been whether this transmission advantage is due to selection at specific glycosylation sites or selection for an overall reduction in glycan number. Our observation that two-thirds of the reduction in glycosylation sites was associated with the variable regions of Env indicates that the transmission advantage is likely not due to a specific glycan but, rather, that transmission of subtype C HIV-1 in the context of heterosexual transmission favors viruses whose Env proteins have overall fewer glycans.
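One way to read the 20%-per-site estimate is as a multiplicative reweighting of the chronic distribution of glycosylation counts. The sketch below illustrates that reading only; the exact procedure used to obtain the estimate is not described here, so the weighting scheme, function names, and the example distribution are all assumptions made for illustration rather than data from this study.

```python
from typing import Dict


def reweight_by_transmission_advantage(
    chronic: Dict[int, float], advantage_per_lost_site: float = 0.20
) -> Dict[int, float]:
    """Predict a transmitted-virus distribution of glycosylation counts.

    Assumed model: relative to a virus with the highest observed count,
    each lost site multiplies the transmission probability by
    (1 + advantage_per_lost_site). The result is renormalized to sum to 1.
    """
    k_max = max(chronic)
    weighted = {
        k: p * (1.0 + advantage_per_lost_site) ** (k_max - k)
        for k, p in chronic.items()
    }
    total = sum(weighted.values())
    return {k: w / total for k, w in weighted.items()}


def mean_count(dist: Dict[int, float]) -> float:
    """Mean glycosylation count of a normalized distribution."""
    return sum(k * p for k, p in dist.items())


if __name__ == "__main__":
    # Hypothetical chronic distribution (count -> frequency), not study data.
    chronic = {27: 0.1, 28: 0.2, 29: 0.3, 30: 0.25, 31: 0.15}
    predicted = reweight_by_transmission_advantage(chronic)
    print(mean_count(chronic), mean_count(predicted))
```

Under this model the predicted distribution is shifted toward lower counts, and the size of the shift depends on the spread of the chronic distribution as well as on the per-site advantage.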
The fact that subtypes differ in the extent to which they favor transmission of underglycosylated viruses may provide insight into the mechanism of this transmission advantage. We hypothesize that this difference between subtypes B and C is due to differences in their predominant modes of transmission. This suggestion is supported by two observations. First, the subtype B HIV-1 epidemic is dominated by male-to-male transmission, and the magnitude of the transmission advantage of reduced glycosylation is much less apparent. In contrast, subtype C HIV-1 is predominantly heterosexually transmitted, and transmitted HIV-1 subtype C variants have a more pronounced reduction in glycan number. Second, we show that female-to-male transmission is associated with a larger reduction in glycosylation count than male-to-female transmission (Fig. 3). Together, these observations suggest that transmission from females selects for underglycosylated viruses.
Transmission from females requires that the virus maintain infectivity after being secreted into the cervicovaginal mucus. Humans produce mannose binding lectin (MBL), a protein that binds mannose, and this protein has been shown to be capable of neutralizing HIV-1 (80). It is possible that MBL in vaginal secretions traps or neutralizes virus in a way that is enhanced by higher glycosylation density. This model is consistent with the recent observation of a reduced glycosylation count in viruses transmitted intrapartum but not intrauterine (81). Alternatively, differences in Langerhans cells (LCs) in the male epithelium and the female epithelium (such as the differences observed between skin and vaginal LCs [82]) could result in either enhancement of trans-infection of T cells by underglycosylated viruses or inhibition of trans-infection of T cells by viruses that bind surface lectins too tightly due to higher levels of glycosylation.
It is now clear that transmission involves the infection of a cell with high levels of CD4, which identifies CD4+ T cells as the target. Studies of tissue at the site of infection in recently infected macaques found that T cells were the predominant cell type infected (83). Several studies of human cohorts have failed to find examples of transmitted viruses that can efficiently enter macrophages (12, 16–18) or infect cells with low levels of CD4 (a surrogate for macrophage tropism) (11, 13). We have used a large sample size and a quantitative assay for CD4 dependence to show that the viruses transmitted in heterosexual transmission require high levels of CD4 for entry, i.e., the high levels found on activated T cells and not on macrophages (Fig. 1A). Given the low levels of CD4 on dendritic cells from the peripheral blood (84) or the gut mucosa (85), it is unlikely that infection of DCs plays a role in the transmission process. Finally, our failure to find viruses isolated from the blood of chronically infected individuals that have the ability to enter cells with low levels of CD4 indicates that the evolution of macrophage-tropic virus is likely restricted to very specific circumstances, such as in the CNS, where we have been able to show a link between the infection of a long-lived cell and the presence of virus able to enter cells with low levels of CD4 (69, 86). Thus, the transmitted virus, and most examples of HIV-1 found in the blood, are appropriately called R5 T cell-tropic.
Transmitted viruses and viruses from chronic infections do not differ in how their Env proteins interact with CD4. This is based on two assays: inhibition by soluble CD4 (Fig. 4) and entry efficiency as a function of CD4 density (Fig. 1A). Previous work suggested that a property of transmitted viruses is high α4β7 integrin binding and low binding of gp120 to a monomer of soluble CD4, with the typical HIV-1 Env protein having the reverse property (25). Similar to our results with subtype C HIV-1, no difference in sensitivities to soluble CD4 has been seen for subtype B HIV-1 in comparing acute/transmitted viruses and viruses from chronic infection (8, 71), although a difference has been reported for neutralization sensitivities to CD4 binding site antibodies for subtype B (15) but not subtype C HIV-1 (13) (Fig. 4A). Thus, if there are differences in the interaction with CD4, they are small and largely assay specific. Similarly, a recent analysis of a panel of transmitted viruses and viruses from chronic infection failed to confirm a role for α4β7 binding as a specific feature of the transmitted virus (13). Overall, we found no difference in neutralization sensitivities between viruses from acute versus chronic infection, with the exception of one polyclonal serum (Fig. 4B). Thus, we conclude that there are no significant differences in either conformation or heterologous neutralization sensitivity that distinguish the transmitted virus. The relationship between underglycosylation of the transmitted virus and neutralization sensitivity is less clear. In 13 of the 18 examples where glycosylation alters neutralization sensitivity, underglycosylation increases sensitivity to neutralization (Fig. 4C), but connecting this pattern to transmission is difficult given that underglycosylation of the transmitted virus is likely a small signal in the background of high variability of both neutralization sensitivity and overall glycosylation diversity.
Another feature of the transmitted virus phenotype is a greater dependence on a maraviroc-sensitive conformation of CCR5. G protein-coupled receptors (GPCRs) have extreme levels of structural flexibility that allow them to bind many different ligands and achieve ligand-specific conformations (reviewed in reference
87). Epitope mapping studies have found that some anti-CCR5 monoclonal antibodies recognize a greater fraction of the total CCR5 molecules than other antibodies, thus suggesting that, like other GPCRs, CCR5 exists in a number of conformations (88, 89). While the nature of CCR5 conformational variation has not been extensively studied, it appears to be substantial on both cultured (88, 89) and primary T cells (88) and could be generated by a number of mechanisms, including posttranslational modification (90, 91), the lipid environment (92), and ligand or G protein binding (87). Furthermore, some CCR5 antagonist-resistant viruses have been shown to differ in their sensitivities to anti-CCR5 monoclonal antibodies, raising the possibility that HIV-1 can evolve the ability to bind these alternative CCR5 conformations as a resistance pathway (88). Similarly, resistance to maraviroc has been shown to vary by cell line and by donor (93), consistent with the idea that different cells can display different forms of CCR5.
Here, we show that viruses isolated from chronically infected subjects are more often partially resistant to maraviroc than viruses isolated from acutely infected subjects. While 75% of the viruses from chronic infections had the ability to use this alternative conformation, only about 40% of the transmitted/founder viruses could use this conformation (Fig. 1B and 2). Because the subjects used in this study were treatment naive, these innate resistance levels reveal variation in how viruses interact with CCR5. This interpretation implies that viruses from chronic infection are more variable in how they interact with CCR5 and that they are able either to infect using maraviroc-bound CCR5 or using a CCR5 conformation that maraviroc is unable to bind. It is possible that this diversity in the ability to use different CCR5 conformations is analogous to the evolution to use CXCR4 as a coreceptor late in infection, possibly allowing the virus to grow in a different subset of T cells. Several studies have shown that R5 viruses taken from late in disease are more difficult to inhibit with agents that bind to CCR5 (14, 94). Since the viruses from chronic infection have an expanded CCR5 coreceptor usage capacity, it is difficult to explain a selective pressure that would restrict this capacity in the transmitted virus. Perhaps the expanded coreceptor capacity allows the virus to infect a cell type that is on average less productive for successful transmission. Alternatively, this expanded coreceptor capacity may be linked to some other feature of Env that is important for transmission or to the level of glycosylation. We do not know which form of CCR5 predominates on CD4+ T cells, although the therapeutic efficacy of maraviroc suggests that it is largely the sensitive form (95). However, a recent failure of maraviroc to block rectal transmission of the simian-human immunodeficiency virus (SHIV) isolate 162p3 in macaques in the face of high drug exposure (96) may suggest a role for this alternative conformation in at least some settings. Parrish et al. (13) did not observe a resistance plateau using a largely nonoverlapping set of subtype C HIV-1 isolates, although this analysis was done using a cell line (NP2/CD4/CCR5) that expresses a single, lower level of CCR5. These investigators have recently repeated this experiment based on the results reported in our manuscript and have confirmed the difference in transmitted viruses versus chronic viruses (97).
The modest underglycosylation of transmitted viruses (Fig. 3) emphasizes the fact that glycosylation is a dynamic state for HIV-1, with some glycosylation sites in the highly variable loops being poorly conserved and other sites in the conserved regions of Env being moderately or highly conserved. There are a number of examples, from this and previous studies, indicating that there is an intimate relationship between glycosylation at specific sites in Env and sensitivity or resistance to antibody neutralization. We propose that a vaccine that targets structural epitopes underlying multiple, variable glycosylation sites could neutralize a large fraction of viruses.
There are several examples where inclusion of part of the carbohydrate structure into the epitope has resulted in an antibody that has broad neutralizing capacity, such as the monoclonal antibody 2G12 (98, 99), the monoclonal antibody PGT 128, which shows a specificity for the presence of a glycan at position 332 (100), and the broadly neutralizing antibodies PG9 and PG16, which include specific glycans in the V1/V2 β-sheet scaffold as part of the epitope (35, 74, 101). However, epitopes that include carbohydrate must be the exception to what the virus experiences, since selection maintains such a high glycosylation density as a general strategy to avoid neutralization.
There is also evidence that glycosylation sites can confer resistance to neutralization, as opposed to being the target of neutralization. Changes in glycosylation within the variable loops represent an important path of escape from autologous neutralizing antibodies (77). There are also examples where moderately conserved glycans have been implicated in protecting specific features of the Env protein surface. The glycan at position 386 is a major determinant of sensitivity to the CD4 binding site antibody b12 (36, 73), and we were able to see this effect across our data set, as well as a potency effect on the CD4 binding site antibody VRC01 by glycosylation at 386 and the spatially adjacent site 276 (Fig. 4C). Similarly, the α2 helix can be an important target for neutralization (42, 102), with one example where escape was mediated by the addition of a glycan at position 339 (103). More recently, Moore et al. (104) have observed escape from an autologous neutralization response by the addition of a glycan at position 332, which then conferred sensitivity to the glycan-dependent monoclonal antibody PGT 128. Given that glycans are typically present in the conserved regions of Env (Fig. 3D), the antibodies that react with the surface protein structures of Env otherwise covered by these glycans would appear as largely autologous, i.e., of restricted range in their neutralization properties. A more systematic approach is needed to determine the breadth of the neutralization properties of antibodies to these surface structures among isolates where the same glycan is missing.
If these moderately conserved glycans are viewed as masks for the Env protein surface, then these carbohydrate side chains can be grouped as protecting specific structures (Fig. 5). The glycans at positions 262, 332, 442, and 446/448 are all placed such that they could occlude the surface of the three-stranded β-sheet β12/13/22. The glycans at positions 276 and 386 are positioned to protect the protein surface around the CD4 binding site. Similarly, a group of glycans (positions 88, 230, 234, and 241) in the region where gp120 and gp41 are thought to interact (105, 106) could represent a distinct domain where glycans are occluding putative neutralization epitopes. The glycan at position 337/339 sits on the α2 helix. The single glycans on loop C (position 289) and loop E (position 356/358) could similarly occlude specific epitopes in these loops. Finally, the newly described four-stranded β-sheet that provides a scaffold for the V1 and V2 surface loops is covered with four glycans (positions 130, 156, 160, and 197) (74).
Given that these glycosylation sites are variably present, we asked what the mean occupancy of these sites was collectively among the transmitted viruses. As shown in Fig. 5, 93% of the transmitted viruses are missing a glycan associated with at least one of these structures, and 74% have at least one glycan missing from two or more separate structures. If these moderately conserved glycosylation sites protect epitopes that can be recognized by the host, then the absence of some of these glycosylation sites in most isolates should make the virus sensitive to antibodies targeted to these surface structures of Env. In much the same way that the virus toggles between cytotoxic T lymphocyte epitope escape mutations and reversion to wild-type sequence (107–109), the virus may also toggle between exposing an otherwise carbohydrate-occluded epitope in the absence of the selective pressure of neutralizing antibodies to that site in a specific host and retaining (or evolving anew) the glycosylation site when there is a host response to that protein surface epitope. A vaccine approach that was able to elicit a response to these surface structures could represent a combinatorial approach to neutralizing a large majority of transmitted viruses, targeting the differing subsets of structures exposed on different viruses, an approach that is already being explored for the CD4 binding site (110). This strategy would take advantage of the variable presence of these glycosylation sites in the conserved domains of Env among the entire viral population, an advantage that is enhanced by the further reduction in glycosylation of the transmitted virus.
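The per-structure tally described above can be sketched as follows. The groupings are taken from the positions listed in the text; treating paired positions such as 446/448 as alternative locations of a single glycan is our assumption for this sketch, and the structure labels, function names, and input format (a set of positions at which a glycosylation site is present for each virus) are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Each structure maps to its glycans; a tuple with two positions (e.g., 446/448)
# is assumed to describe one glycan that may be encoded at either position.
GLYCAN_GROUPS: Dict[str, List[Tuple[int, ...]]] = {
    "beta12/13/22 sheet": [(262,), (332,), (442,), (446, 448)],
    "CD4 binding site": [(276,), (386,)],
    "gp120-gp41 interface": [(88,), (230,), (234,), (241,)],
    "alpha2 helix": [(337, 339)],
    "loop C": [(289,)],
    "loop E": [(356, 358)],
    "V1/V2 scaffold": [(130,), (156,), (160,), (197,)],
}


def structures_missing_a_glycan(present_sites: Set[int]) -> List[str]:
    """Return the structures for which at least one protecting glycan is absent."""
    missing = []
    for structure, glycans in GLYCAN_GROUPS.items():
        if any(
            not any(pos in present_sites for pos in alternates)
            for alternates in glycans
        ):
            missing.append(structure)
    return missing


def summarize(viruses: Iterable[Set[int]]) -> Tuple[float, float]:
    """Fractions of viruses with >=1 and with >=2 structures missing a glycan."""
    counts = [len(structures_missing_a_glycan(v)) for v in viruses]
    if not counts:
        raise ValueError("at least one virus is required")
    n = len(counts)
    return (sum(c >= 1 for c in counts) / n, sum(c >= 2 for c in counts) / n)
```

Applied to a panel of transmitted viruses, the two fractions returned by `summarize` correspond to the kinds of quantities reported above (viruses missing a glycan from at least one structure, and from two or more separate structures).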