Retroviruses, including human immunodeficiency virus type 1 (HIV-1), are diploid, containing two genomic RNA molecules per virion. This viral RNA serves as the template for proviral DNA synthesis by the virus-encoded enzyme reverse transcriptase (RT). HIV-1 is characterized by its rapid genetic evolution. Genetic diversity in HIV-1 is well evidenced from the large number of different HIV-1 strains isolated around the world, which have been divided into three groups. The major group has been further divided into 10 nucleotide sequence-defined subtypes (24, 37, 39). Due to the rapid genetic changes, HIV-1 in vivo is defined as a quasispecies, that is, a population of highly related yet genetically distinct viruses within the same individual (11, 46). Sequential analysis of the quasispecies from infected patients shows substantial variation of genetic information over the course of infection (4, 8, 28, 45). This rapid genetic variation provides HIV-1 with maximum adaptation efficiency and poses serious challenges for chemotherapy and vaccine development for HIV-1 infection.
One mechanism that contributes to HIV-1 genetic variation is its high mutation rate. The RT of HIV-1 is error prone, in part due to its lack of proofreading activity. Mutation rate studies indicate that the HIV-1 genome averages 0.3 nucleotide change per cycle of virus replication (25; P. K. O'Neil, G. Sun, B. D. Preston, and J. P. Dougherty, unpublished data).
Another means to generate genetic changes is through recombination. Since two RNA molecules are packaged in each virion, RT may switch from one template to another during reverse transcription. If two RNAs with sequence differences are copackaged in one virion, a mosaic HIV-1 genome containing genetic information from both RNAs could be generated, yielding novel viral genomes. Recombination can also salvage genome damage or detrimental mutations in one RNA molecule by adopting genetic information from the intact template (5, 38). Thus, recombination may not only introduce genetic diversity, it may also serve as a repair mechanism for the HIV-1 genome.
Accumulating evidence has confirmed the existence of recombinant HIV-1 in nature (33). Phylogenetic analyses have detected recombinant HIV-1 by exploiting the substantial nucleotide sequence differences between different subtypes (3, 12, 13, 27, 32, 36). Results indicate that approximately 10% of sequenced HIV-1 strains are mosaic genomes with genetic material from different subtypes (34, 35). In addition, recombination between two highly divergent groups of HIV-1 has recently been identified (40). Although the efficiency and the circumstances fostering the generation of recombination are largely unknown, the phylogenetic studies clearly indicate that recombination events contribute significantly to HIV-1 diversity. Further information regarding the rate, spectrum, and mechanism of HIV-1 recombination is needed for a thorough understanding of the HIV-1 recombination process.
A system employing two HIV-1-derived vectors confined to a single cycle of replication was used to examine the rate and mechanism of HIV-1 recombination throughout the entire genome using the heteroduplex tracking assay (HTA). The vectors were based on different strains of HIV-1 and were used to generate heterozygous virions. In order to closely reflect natural virus replication, these vectors were similar to the wild type in both size and sequence content. Sequence differences between the two strains of HIV-1 were exploited to monitor recombination events. The results from this study indicate that, on average, HIV-1 recombines approximately two to three times in every cycle of replication. This overall rate of recombination is similar to the rate previously observed during minus-strand synthesis (49), indicating that recombination occurs mainly during minus-strand DNA synthesis. It also suggests that both virion RNAs are utilized during replication. Furthermore, a higher than average number of recombination events was observed in two regions of the genome, suggesting that sequence context can play a role in HIV-1 recombination.
RESULTS
Generation of heterozygous virions.
Recombination events may be observed from progeny of heterozygous virions after a single cycle of virus replication. Two different HIV-1 vectors were used to generate heterozygous vector virions. HIV-gptHXB2 and HIV-puroBCSG3 are env-defective HIV-1 vectors (Fig. 1) and are based on the HXB2 and BCSG3 strains of HIV-1, respectively (29, 49). The sequence difference between the HXB2 and BCSG3 strains of HIV-1 is approximately 5%.
The protocol employed to measure HIV-1 recombination is outlined in Fig. 2. The approach is predicated upon confining replication to a single cycle because the provirus in the target cell provides a “fossil record” from which one can draw conclusions about dynamic events that occurred during reverse transcription. Eight independent producer cell clones containing single copies of HIV-gptHXB2 and HIV-puroBCSG3 vector proviruses were established as previously described (49). Upon removal of tetracycline, producer cell clones were induced to express the HIV-1 Env protein and infectious vector virus was produced. Supernatant containing vector virus was used to inoculate CD4-positive HeLaT4 cells. Since the producer cells do not express CD4, vector virus cannot spread among them. Since the CD4-positive HeLaT4 cells do not express the HIV-1 Env protein, the vector virus cannot further propagate after infection. Therefore, from the producer cell to the target cell, the vector virus is restricted to a single cycle of replication.
Only progeny derived from heterozygous virions are informative for the study of recombination, since crossovers between homozygous genomes cannot be scored. The puromycin-resistant titers were typically 100-fold lower than the GPT-resistant titers, a difference that is likely to reflect a difference in the amount of RNA transcribed and packaged. The proviruses in puromycin-resistant target cells are therefore likely to be progeny of heterozygous virions. This prediction was subsequently borne out, since all of the proviral clones analyzed were shown to be recombinants (see below). From 8 independent producer cell clones, 86 puromycin-resistant target cell clones were established and analyzed for recombination. These 86 clones were sensitive to GPT selection, thus ruling out the possibility of double infection.
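The reasoning behind selecting the low-titer marker can be illustrated with a short calculation. The sketch below is not part of the original analysis. It assumes, hypothetically, that the two vector RNAs are packaged into virions at random and that the 100-fold titer difference equals the ratio of packaged RNAs; the function name is illustrative.

```python
def heterozygous_fraction(titer_ratio: float) -> float:
    """Expected fraction of puro-containing virions that are heterozygous.

    Assumptions (not from the source): random copackaging of the two RNAs,
    and a gpt:puro RNA ratio equal to the observed titer ratio.
    """
    if titer_ratio < 0:
        raise ValueError("titer_ratio must be non-negative")
    p = 1.0 / (1.0 + titer_ratio)  # share of packaged RNA carrying puro
    q = 1.0 - p                    # share of packaged RNA carrying gpt
    heterozygous = 2.0 * p * q     # virions with one RNA of each kind
    homozygous_puro = p * p        # virions with two puro RNAs
    return heterozygous / (heterozygous + homozygous_puro)


fraction = heterozygous_fraction(100.0)
print(f"Expected heterozygous fraction: {fraction:.3f}")
```

Under these assumptions, nearly all virions that carry the puro vector also carry a gpt vector RNA, which is consistent with the observation that every proviral clone analyzed was a recombinant.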
Detection of recombinants by HTA.
HTA was used to screen the progeny proviruses for sequence changes caused by recombination (51). Based on the mobility differences between homoduplex and heteroduplex DNAs upon electrophoresis, HTA can differentiate sequence heterogeneity. In HTA, a 32P-labeled single-stranded probe is amplified first from wild-type plasmid DNA. This probe is then annealed to both wild-type and mutant DNAs. Upon annealing, a homoduplex forms between the probe and the wild-type DNA whereas sequence differences between the probe and the mutant DNA cause heteroduplex formation. These DNA complexes, when subjected to electrophoresis on a nondenaturing polyacrylamide gel, exhibit different mobilities. Wild-type and mutant DNAs can therefore be differentiated. The HTA is capable of detecting sequence divergence when the degree of mismatch exceeds approximately 1% (9, 51). Since the sequence heterogeneity between the HXB2 and BCSG3 strains of HIV-1 is approximately 5% throughout the viral genome, the HTA can be utilized to detect crossovers between HIV-gptHXB2 and HIV-puroBCSG3 that introduce >1% mismatch.
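The 1% detection threshold can be made concrete with a minimal sketch. It is an illustration only: it assumes the probe and target are already aligned and of equal length, ignores insertions and deletions, and treats percent mismatch as the sole determinant of the duplex call, whereas real gel mobility also depends on the type and position of mismatches. The function names and example sequences are hypothetical.

```python
def percent_mismatch(probe: str, target: str) -> float:
    """Percentage of positions that differ between two aligned sequences."""
    if not probe or len(probe) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(a != b for a, b in zip(probe.upper(), target.upper()))
    return 100.0 * mismatches / len(probe)


def expected_duplex(probe: str, target: str, threshold_percent: float = 1.0) -> str:
    """Predict the duplex call using the approximate 1% HTA detection limit."""
    if percent_mismatch(probe, target) > threshold_percent:
        return "heteroduplex"
    return "homoduplex"


# Hypothetical 100-nt probe and a target carrying 5 substitutions (5% mismatch).
probe = "ACGT" * 25
bases = list(probe)
for position in range(0, 100, 20):
    bases[position] = "G"  # every replaced base was originally "A"
divergent = "".join(bases)

print(expected_duplex(probe, probe))      # identical sequences
print(expected_duplex(probe, divergent))  # about 5% divergence
```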
To carry out HTA screening, primers common to both HIV-gptHXB2 and HIV-puroBCSG3 were designed to amplify segments of provirus DNA. Figure 1 shows the distribution of primers over the provirus genomes. The size of each amplified DNA fragment ranged from 612 to 865 bp, and the sequence differences between HIV-gptHXB2 and HIV-puroBCSG3 in each DNA segment ranged from 3.2 to 6.2%. Figure 3 shows a typical HTA gel. A 32P-labeled single-stranded DNA probe corresponding to HIV-puroBCSG3 nucleotides 7168 to 7987 (according to the BCSG3 RNA sequence) (14) was prepared by asymmetric PCR (Fig. 3, lane P). The same primer pair was also used in a standard PCR to amplify provirus DNA from different progeny cell clones and from both HIV-gptHXB2 and HIV-puroBCSG3 plasmids. After the probe had been annealed to the amplified DNA, homoduplex and heteroduplex DNAs were distinguished through gel electrophoresis. The probe formed a homoduplex when annealed to HIV-puroBCSG3 plasmid DNA (Fig. 3, lane S), whereas it formed a heteroduplex when annealed to HIV-gptHXB2 plasmid DNA (Fig. 3, lane H). DNAs from progeny cell clones 1, 2, 5, 7 to 11, 13, and 15 formed homoduplexes when annealed to the probe, indicating no sequence change or less than 1% sequence mismatch. On the other hand, DNAs from clones 3, 4, 6, 12, and 14 formed heteroduplexes when annealed to the probe, indicating more than 1% mismatch and divergence from the original HIV-puroBCSG3 sequence. By the same approach, sequence changes were identified in the other segments of proviruses of the 86 progeny cell clones.
To confirm that the heteroduplex shifts observed in the HTA analysis were produced by recombination and not mutation, several different fragments from several different clones were subjected to sequencing analysis. A total of 47 DNA fragments were sequenced. Fifteen of the 21 DNA fragments that had caused heteroduplex formation in HTA displayed an HIV-gptHXB2 sequence pattern, while the remaining 6 DNA fragments had part of the sequence from HIV-puroBCSG3 and part from HIV-gptHXB2 (data not shown), allowing the crossover regions to be identified in them. Furthermore, 26 of the DNA fragments that had formed homoduplexes in the HTA showed an HIV-puroBCSG3 pattern (data not shown). Although it could not be ruled out that some gel shifts were caused by insertions, deletions, or mutations, sequencing analysis clearly indicated that the majority of DNA fragments that caused a gel shift had part or all of their sequences changed from HIV-puroBCSG3 to HIV-gptHXB2 as a result of recombination.
Recombination occurs at a high rate during virus replication.
To assess recombination over the entire provirus genome, sequence information obtained by HTA for the 10 provirus segments (Fig. 1) and by restriction enzyme analysis for the 5′ and 3′ long terminal repeats (LTRs) (49) was combined. The results for 86 provirus clones are summarized in Table 1. The average number of recombination events that occurred was three crossovers per genome per replication cycle. Although the majority of progeny proviruses experienced two to three recombination events, the number of crossovers per progeny provirus ranged from one to seven (Table 2).
Sequence context may play a role in HIV-1 recombination.
To evaluate the effect of sequence context upon the generation of HIV-1 recombination, the frequency of recombination events that occurred in different DNA segments was compared and plotted in Fig. 4. Recombinants can be identified when two consecutive DNA segments show different sequence origins, as ascertained by HTA. The recombination event might have occurred in the DNA segment that caused heteroduplex formation in HTA and caused more than a 1% sequence difference from the original HIV-puroBCSG3 sequence. Alternatively, the crossover point might have occurred in the adjacent DNA segment that displayed homoduplex formation in the HTA and introduced less than a 1% sequence difference from the original HIV-puroBCSG3 sequence.
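The scoring rule described above, in which a crossover is inferred wherever two consecutive segments differ in sequence origin, can be sketched as follows. This is a simplified illustration rather than the procedure actually used: the segment calls are hypothetical, "B" and "H" are labels chosen here for BCSG3-like and HXB2-like segments, and the count is a minimum because an even number of crossovers within one segment would go unnoticed.

```python
from typing import List, Tuple


def crossover_intervals(origins: List[str]) -> List[Tuple[int, int]]:
    """Return index pairs of adjacent segments whose origins differ.

    Each pair (i, i + 1) marks an inferred crossover; the exchange itself
    may lie in either of the two segments, so its position stays ambiguous.
    """
    return [
        (i, i + 1)
        for i in range(len(origins) - 1)
        if origins[i] != origins[i + 1]
    ]


# Hypothetical calls for one provirus: 5' LTR, 10 HTA segments, 3' LTR.
calls = ["H", "H", "H", "B", "B", "B", "B", "B", "H", "B", "B", "B"]
intervals = crossover_intervals(calls)
print(f"Minimum crossovers: {len(intervals)} at segment pairs {intervals}")
```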
Although recombination events were observed across the entire HIV-1 provirus genome, a high crossover rate was observed in two regions; one region is between segment 8 and the SNV-puro expression cassette, and the other is in the HIV-1 pol region. Eighty of 86 clones recombined between segment 8 and the exogenous SNV-puro sequences. DNA sequencing of HIV-puroBCSG3 revealed a 111-bp region between the SNV U3 promoter and the puromycin resistance marker gene (puro) that is homologous to the 3′ end of the simian virus 40 (SV40) early promoter in HIV-gptHXB2. This region includes the AT-rich region that contains the TATA box-like element. Sequencing analysis of six recombinant clones revealed that all had crossed over in that region and used the SV40 early promoter from the SV-gpt expression cassette to drive puro expression. Thus, there was a high rate of recombination between sequences of nonretroviral origin in this region. The other region of the genome that seemed to be a hot spot for crossovers was at the 5′ end of the pol region. Thirty-four of 86 clones recombined in this region. Twenty-one of these clones were sequenced such that the crossover regions could be visualized. The results showed that these 21 clones had not crossed over at exactly the same spot but that the crossover regions spanned both segments 3 and 4.
To determine whether the observed hot spot in nonretroviral sequences significantly affects the overall recombination rate, the sequence containing the hot spot was deleted and the analysis was repeated. For this purpose, HIV-puroNL4-3 was constructed. It is also env defective, with the deleted sequences replaced with an SNV-puro expression cassette which does not contain the nonretroviral crossover hot spot. Moreover, HIV-puroNL4-3 was derived from HIV-1 strain NL4-3, allowing measurement of recombination rates with a different viral strain. HIV-puroNL4-3 was used in conjunction with HIV-gptHXB2. The sequence difference between the HXB2 and NL4-3 strains of HIV-1 is approximately 3%, providing sufficient heterogeneity for HTA analysis.
Following the protocol described earlier and previously outlined (49), producer cell clones containing a single copy of both HIV-gptHXB2 and HIV-puroNL4-3 were isolated. Since the puromycin-resistant titers were again significantly lower than the GPT-resistant titers after one cycle of replication, puromycin-resistant target cell clones were isolated and analyzed by HTA (data not shown). The 1,473-bp viral sequence immediately upstream of the marker cassette for each of 12 progeny proviral clones was analyzed. The results indicated that five crossovers had occurred, yielding a recombination rate of 2.8 × 10⁻⁴/bp per replication cycle. This rate corresponds to approximately 2.8 crossovers per genome per replication cycle.
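The arithmetic behind this rate can be reproduced in a few lines. The per-base-pair figure follows directly from the numbers reported here (5 crossovers, 12 clones, 1,473 bp each). The extrapolation to the whole genome requires a genome length, which is not stated in the text; the value of 10,000 bp used below is an assumption chosen for illustration, and the names are hypothetical.

```python
def recombination_rate_per_bp(crossovers: int, clones: int, region_bp: int) -> float:
    """Crossovers per base pair per replication cycle over the analyzed region."""
    if clones <= 0 or region_bp <= 0:
        raise ValueError("clones and region_bp must be positive")
    return crossovers / (clones * region_bp)


# Values reported in the text.
rate = recombination_rate_per_bp(crossovers=5, clones=12, region_bp=1473)

# Assumption: an approximate genome length of 10 kb (not given in the source).
ASSUMED_GENOME_BP = 10_000
per_genome = rate * ASSUMED_GENOME_BP

print(f"Rate: {rate:.2e} crossovers/bp per cycle")
print(f"Extrapolated: {per_genome:.1f} crossovers per genome per cycle")
```

With only five events scored, the estimate carries substantial sampling uncertainty, so it is best read as consistent with, rather than identical to, the rate obtained from the 86 clones.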
DISCUSSION
In this report, we describe a system to examine the rate of HIV-1 recombination across the entire HIV-1 genome after a single cycle of viral replication. The vectors utilized were similar in size and sequence content to the authentic HIV-1 genome. The recombination rate obtained was approximately two to three crossovers per genome per replication cycle. Crossovers were identified throughout the viral genome.
A single-cycle approach has been used previously to study retroviral recombination in the avian oncoretrovirus SNV. The recombination rate obtained for HIV-1, a primate lentivirus, is approximately 5- to 15-fold higher than that obtained for SNV (16, 17, 21). Although experimental differences between these studies may be a factor and accessory proteins may influence recombination, we believe that the likeliest reason for the discrepancy is inherent differences between the HIV-1 and SNV RTs.
It is also noteworthy that a positive correlation was previously noted between intermolecular minus-strand primer transfers and recombination for SNV (21). However, this does not seem to be the case for HIV-1. The nature of HIV-1 minus-strand primer transfers, that is, whether they are inter- and/or intramolecular, for the same 86 proviral clones was analyzed in a previous study (49). Of the 260 recombination events, 143 (55%) occurred after intramolecular minus-strand transfer while 117 (45%) occurred after intermolecular minus-strand transfer. Thus, there does not appear to be a strong correlation between the nature of the minus-strand primer transfer and the rate of recombination. We believe that this again reflects an intrinsic difference between the HIV-1 and SNV RTs.
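The proportions quoted above, and the average of about three crossovers per provirus, follow from the reported counts, as the short check below shows. It uses only numbers given in the text; the variable names are illustrative. No significance test is attempted, because the transfer type is a property of each provirus rather than of each crossover, and the number of clones in each class is not given here.

```python
# Counts reported in the text.
total_events = 260
clones = 86
after_intramolecular = 143
after_intermolecular = 117

# The two classes should account for every recombination event.
assert after_intramolecular + after_intermolecular == total_events

mean_per_clone = total_events / clones
intra_fraction = after_intramolecular / total_events
inter_fraction = after_intermolecular / total_events

print(f"Mean crossovers per provirus: {mean_per_clone:.2f}")
print(f"After intramolecular transfer: {intra_fraction:.0%}")
print(f"After intermolecular transfer: {inter_fraction:.0%}")
```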
Two regions exhibited higher-than-average recombination rates. A particularly strong hot spot was the nonretroviral SV40 sequence that was part of the expression cassettes. This hot spot contains an 89-base palindrome that can theoretically form a stable hairpin structure, as predicted by the Genetics Computer Group program FoldRNA (10, 52). This might induce RT to pause, promoting recombination at the palindrome. Hairpin structures have been shown to promote strand transfer in a cell-free system (23). The 5′ end of the pol region also seemed to be a relatively strong hot spot for recombination. Nearly 40% of the proviral clones had crossed over in this region, which spans approximately 1 kbp of the viral sequence. Further study is needed to ascertain whether RNA secondary structure affected recombination in this region.
The results show that the presence of a nonviral hot spot did not affect the overall recombination rate in other parts of the genome. The calculated recombination rate, including the nonviral hot spot, is 3 × 10⁻⁴/bp per replication cycle, whereas omission of the hot spot crossover events from the calculation produces a rate of 2.4 × 10⁻⁴/bp per replication cycle. Deletion of this hot spot sequence from the vector significantly reduced the recombination rate in that region, suggesting that sequence context and RNA secondary structure can affect recombination, which is not surprising in light of previous reports suggesting that they can affect the frequency of mutation by RT (21, 30, 31). However, although the frequency of recombination decreased in this segment, recombination events did occur at a rate of 2.8 crossovers per genome per cycle. Thus, even in the absence of this hot spot, the recombination rate remained extremely high. In addition, the six clones that had not recombined at the hot spot retained the SNV U3 promoter and experienced crossover events at a rate of 2.3 per genome per replication cycle, providing further evidence that recombination at the nonviral hot spot is not a prerequisite for a high rate of recombination.
Two models of retroviral recombination have been proposed (41). The strand displacement assimilation model proposes that DNA fragments are displaced during plus-strand DNA synthesis from one template and subsequently assimilated by the plus-strand DNA synthesized from the other template, causing recombination during plus-strand synthesis (20, 22). The copy choice model proposes that recombination occurs during minus-strand DNA synthesis. The original “forced copy choice” model hypothesizes that RT switches templates when it encounters RNA breaks, promoting recombination during minus-strand DNA synthesis (6, 44). The copy choice model has recently been broadened to include recombination that occurs during minus-strand synthesis but without the requirement of breaks in viral RNA. This model suggests that the low processivity of RT causes the enzyme to dissociate from the template, allowing the short DNA-RNA hybrid to be disrupted so that the growing DNA strand is displaced to the other RNA template (7, 48).
The results presented here and previously provide evidence that HIV-1 recombination occurs mainly during minus-strand DNA synthesis, supporting a copy choice model as the predominant mechanism of recombination. We previously reported that the rate of recombination in U3 of the viral LTR was 3 × 10⁻⁴/bp per replication cycle, which, when extrapolated to the entire genome, indicates a rate of approximately three crossovers per genome per replication cycle. Because the rate was obtained for the viral LTRs, it was possible to determine whether the crossovers occurred during minus-strand or plus-strand synthesis. The results indicated that recombination occurred primarily during minus-strand DNA synthesis (6:1 minus-strand-to-plus-strand ratio) (49). The fact that the rate obtained during minus-strand synthesis in U3 is similar to that obtained for the entire genome provides support for the theory that the majority of crossovers throughout the genome occur during minus-strand DNA synthesis. Previous studies using SNV-based vectors have also indicated that retroviral recombination occurs during minus-strand DNA synthesis (2, 18). Moreover, given that the rate of recombination during minus-strand synthesis is high at least early during reverse transcription, the frequency of recombination events should be further skewed toward minus-strand DNA synthesis, particularly late during reverse transcription. If a crossover occurs during minus-strand synthesis, then RNase H activity degrades the second RNA template, preventing further synthesis of a second minus-strand DNA and obviating further recombination during plus-strand synthesis because of the lack of a second template. Furthermore, given the high rate of recombination, the forced copy choice mechanism would require that the majority of virion RNAs have multiple breaks. This is possible, but given the low degree of processivity exhibited by HIV-1 RT (19), it seems more plausible that crossovers can occur without RT encountering strand breaks. Taken together, these results suggest that HIV-1 recombination occurs primarily during minus-strand synthesis via a simple copy choice mechanism.
The high rate of recombination, taken together with the previous finding that 50% of all minus-strand primer transfers are intermolecular (43, 47, 49), implies that both HIV-1 RNAs are typically utilized during reverse transcription. Nevertheless, this does not exclude the possibility that a single virion RNA can act as the sole template for reverse transcription, as was previously reported for SNV (21). The high capacity of HIV-1 RT to generate recombinant progeny proviruses seems to support the contention that the reason retroviruses are diploid is to provide a recombination partner (42), implying that there are strong selective pressures promoting recombination. This indicates that recombination is an integral aspect of HIV-1 replication and is likely to play an important role in the generation of viral diversity.