INTRODUCTION
A prominent feature of human immunodeficiency virus type 1 (HIV-1) is the genetic breadth and plasticity of its populations (
66). Early molecular epidemiology studies revealed several distinct genetic lineages, now called subtypes, within the main group of HIV-1. The discrete albeit overlapping geographic distribution of these subtypes suggested that many of their differences arose via gradual mutagenesis over time, after the introduction of HIV-1 into humans (
210). These subtypes are given alphabetic designations such as subtype A, B, or C and are joined in the pandemic today by a few dozen additional strains that contain interwoven genetic segments derived from multiple earlier-recognized subtypes. Because these recombinants spread among patients, they have been designated circulating recombinant forms (CRFs), and emerging evidence suggests that some historically defined subtypes may themselves be CRFs (
3). In addition to CRFs, unique recombinant forms far too numerous to cite have been isolated from individual patients (
207).
Both recombination and point mutations contribute to the genetic variation in HIV-1 populations. Base substitutions are introduced principally by error-prone DNA synthesis (
263) or by the activities of host antiviral factors such as APOBEC3 family cytidine deaminases (
56). These processes introduce roughly 1 substitution per viral genome per generation. Thus, point mutation rates alone are sufficient to explain why retroviruses like HIV-1 exist as snowflake-like quasispecies in which nearly every virus in a population differs from every other one.
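The quasispecies argument above can be illustrated with a short calculation. The following is a minimal sketch that assumes substitutions arrive as a Poisson process with a mean of about 1 per genome per generation; the Poisson model and the function names are illustrative assumptions, not part of the cited work.

```python
import math


def prob_unchanged(mean_subs_per_generation: float, generations: int) -> float:
    """Probability that one lineage acquires no substitutions.

    Assumes (hypothetically) that substitutions per genome per generation
    are Poisson distributed, so P(no substitutions) = exp(-mean * generations).
    """
    return math.exp(-mean_subs_per_generation * generations)


def prob_pair_identical(mean_subs_per_generation: float, generations: int) -> float:
    """Probability that two lineages descended independently from one
    ancestor both remain unchanged, and are therefore still identical.
    (Identity through matching mutations is ignored in this sketch.)"""
    return prob_unchanged(mean_subs_per_generation, generations) ** 2


if __name__ == "__main__":
    for generations in (1, 5, 10):
        p = prob_pair_identical(1.0, generations)
        print(f"{generations:>2} generations: P(pair still identical) = {p:.2e}")
```

Under these assumptions, the chance that two descendants remain identical falls off exponentially with the number of generations, which is consistent with nearly every virus in a population differing from every other one.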
Point mutations accumulate fairly steadily over generations and thus can be used to clock viral strain divergence (
185). Although selection dictates that variation is not constant across HIV-1 genomes (
66), the density of accumulated mutations can be used to determine whether a given viral isolate has undergone extensive rounds of replication or has recently been reactivated from a long-established provirus.
In contrast to the clock-like accumulation of genetic change introduced by point mutations, recombination can reset the clock by scrambling genetic content. This can lead to beneficial combinations of mutations, the loss of deleterious mutations, or new starting points for subsequent viral evolution. Whenever clustered substitutions are observed, the variation more likely arose via recombination than via serial point mutations (
198,
338). (Hypermutation may be an exception to this rule, although whether such mutations embedded in less altered genome regions result more frequently from recombination or from the limited processivity of mutagenic factors remains unclear [
48,
222,
236].) Some instances of phenotypic switch, including coreceptor switch and reacquisition of drug resistance, have been linked to mutations embedded in localized sequences that differ significantly from flanking sequences, thus providing evidence for recombination within individual patients' virus populations (
118,
212,
235,
271).
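The idea that clustered substitutions point to recombination can be sketched as a sliding-window scan over two aligned sequences. This is a simple heuristic under stated assumptions, not the method used in the cited studies: the window size, the threshold, and the function names are hypothetical, and, as noted above, hypermutation can also produce clusters.

```python
def mismatch_positions(seq_a: str, seq_b: str) -> list[int]:
    """Return the 0-based positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def clustered_windows(seq_a: str, seq_b: str,
                      window: int = 10,
                      min_mismatches: int = 3) -> list[tuple[int, int]]:
    """Return (start, count) for each window holding at least
    `min_mismatches` differences. Both parameters are illustrative."""
    mismatches = set(mismatch_positions(seq_a, seq_b))
    hits = []
    for start in range(len(seq_a) - window + 1):
        count = sum(1 for i in range(start, start + window) if i in mismatches)
        if count >= min_mismatches:
            hits.append((start, count))
    return hits


# Toy data: one isolated difference (position 3) and one cluster (20, 22, 24).
reference = "ACGT" * 10
isolate_bases = list(reference)
for pos, base in ((3, "C"), (20, "T"), (22, "T"), (24, "T")):
    isolate_bases[pos] = base
isolate = "".join(isolate_bases)

if __name__ == "__main__":
    print(mismatch_positions(reference, isolate))
    print(clustered_windows(reference, isolate))
```

In this toy case only the windows spanning positions 20 to 24 are flagged; the isolated difference at position 3 is not, which mirrors the distinction drawn between localized divergent segments and scattered point mutations.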
The propensity of retroviruses to undergo recombination was recognized long before HIV-1 was identified as the causative agent of AIDS, and thus, evidence for HIV-1 recombination, which was noted as soon as intact viral genomes were sequenced, was not surprising (
65). In the 1970s, work with animal retroviruses revealed that markers reassorted so readily that they appeared unlinked (
334,
349). Because the retroviral genome is a single RNA, its genes cannot physically reassort; this observation therefore suggested that retroviruses had evolved to recombine their physically linked genes at an unprecedentedly high rate.
Early experiments addressing whether HIV-1 could recombine confirmed that recombination was readily detectable. For example, one mutant's stop codon was rescued by recombination with a different defective HIV-1 in tissue culture, and recombination also led to the cosegregation of drug resistance mutations (
60,
163,
220). Because these experiments provided strong selection for recombinants, they could not rule out the possibility that recombination was rare. However, when cultured cells were experimentally coinfected with two distinct strains with similar fitnesses, more than 20% of the proviral population was found to be recombinant, suggesting that recombination was exceptionally frequent (
178). Simian immunodeficiency virus (SIV) recombination was readily detected in experimentally coinfected monkeys, demonstrating that recombination of HIV-like lentiviruses also occurs in vivo (
100,
347).
HIV-1 recombination does not involve nucleic acid breakage and rejoining but instead results from reverse transcriptase (RT) template switching between viral RNAs during provirus synthesis. Two fundamental properties of retroviruses are critical to their high frequency of recombination. The first is that retroviral genomic RNAs (gRNAs) are encapsidated in pairs. Upon infection of a new cell, the proximity of the two gRNAs facilitates template switching that is orders of magnitude more frequent than that for other viruses. Despite harboring two complete gRNAs per particle, retroviruses are not truly diploid and are best described as “pseudodiploid” (Fig.
1). This is because no more than one DNA is synthesized per virion, and thus, only one allele at each locus is passed on in the progeny DNA. Part of the reason that no more than one DNA is made per viral particle is stochastic: probably less than 1% of all virions generate infectious proviruses, and thus, the probability that a single virion generates two is <0.01%. Furthermore, template switching during minus-strand synthesis all but precludes the generation of more than one DNA per virion due to RNase H degradation of template segments. In this same manner, the high frequency of HIV-1 recombinogenic template switching effectively limits the number of DNAs generated per gRNA dimer to 1 (
368).
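The stochastic part of this argument can be written out as a one-line calculation. It assumes, as a simplification not stated explicitly in the text, that each of two proviruses from one virion would arise independently with the same probability p, where p is the chance that a virion yields an infectious provirus.

```latex
p < 10^{-2}
\quad\Longrightarrow\quad
P(\text{two proviruses from one virion}) = p^{2} < \left(10^{-2}\right)^{2} = 10^{-4} = 0.01\%
```

The independence assumption is generous: because RNase H degrades template segments during minus-strand synthesis, the true probability is expected to be lower still.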
The second property of retroviruses that is critical to their unusually high recombination frequency is their recombination-prone replication machinery. It was hypothesized that retroviruses are prone to recombinogenic template switching because of the need to perform two mechanistically similar replicative template switches during every round of viral DNA synthesis (
68,
320) (Fig.
2). Retroviral genomes are composed of single-stranded RNAs, designated “plus-strand” (or “sense-strand”) RNAs because they contain open reading frames that are recognizable by host ribosomes. The first DNA intermediates synthesized are thus minus stranded, or antisense. The generation of a retroviral DNA is not as simple as the copying of plus-sense RNA into minus-strand DNA, followed by the synthesis of a plus-strand complement. Instead, two replicative template switches, also known as strong-stop strand transfers or “jumps,” join and duplicate sequences found only once in gRNA to reconstitute the long terminal repeats at the boundaries of preintegrative DNA (
110,
318).
In contrast to strong-stop switches, which occur almost exclusively at defined positions, recombinogenic template switching may occur from any position in the retroviral genome (
11). In its simplest form, retroviral recombination involves copying part of one gRNA, followed by RT switching to a homologous region on a copackaged gRNA to complete viral DNA synthesis (
64). This can lead to recombinant genomes if the copackaged RNAs contain allelic differences. In the example shown in Fig.
3, one parental gRNA contains a protease (PR) allele with mutations that confer broad resistance to protease inhibitors, while the second genome contains RT sequences that confer zidovudine (AZT) resistance. Because both high-level resistance to AZT and cross-resistance to protease inhibitors require multiple alterations, developing either one can take many virus generations and extensive mutagenic exploration of the fitness landscape (
42,
107). In contrast, once each resistance allele has developed independently, recombination permits the cosegregation of both traits in a single cycle of replication.
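The simplest form of recombinogenic template switching described above can be sketched in a few lines. This is a toy model under stated assumptions: the two gRNAs are aligned strings of equal length, copying is shown left to right for readability (real minus-strand synthesis runs in the opposite direction along the RNA), and the allele layout and the function name are hypothetical.

```python
def reverse_transcribe(grna_a: str, grna_b: str, switch_points: list[int]) -> str:
    """Copy grna_a from position 0, moving to the homologous position on
    the other copackaged gRNA at every index listed in `switch_points`."""
    if len(grna_a) != len(grna_b):
        raise ValueError("copackaged gRNAs must be aligned to equal length")
    templates = (grna_a, grna_b)
    switches = set(switch_points)
    current = 0  # index of the template currently being copied
    product = []
    for pos in range(len(grna_a)):
        if pos in switches:
            current = 1 - current  # template switch to the other gRNA
        product.append(templates[current][pos])
    return "".join(product)


# Toy genomes: positions 0-3 stand for PR, 4-7 for a spacer, 8-11 for RT.
# "R" marks a resistance allele and "s" a drug-sensitive allele.
parent_pr_resistant = "RRRR" + "----" + "ssss"   # protease inhibitor resistant
parent_azt_resistant = "ssss" + "----" + "RRRR"  # AZT resistant

if __name__ == "__main__":
    # A single switch inside the spacer joins both resistance alleles.
    print(reverse_transcribe(parent_pr_resistant, parent_azt_resistant, [6]))
```

With one switch between the PR and RT regions, the product carries both resistance alleles after a single round of synthesis, whereas a copy made with no switch is identical to the first parent.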
In this review, the term “genetic recombination” is used to describe the reassortment of viral genome regions. However, the integration of a provirus is arguably the ultimate form of retrovirus-mediated genetic recombination. Also, as discussed below, HIV-1 integration provides a means of persistence and access to host genetic material that can influence recombination outcomes. Although very infrequent, host-mediated recombination among integrated retroviruses and related elements can also occur and can contribute to host evolution (
18,
122,
140).