Coronavirus (CoV) phylogeny and biology, as demonstrated during the severe acute respiratory syndrome (SARS) epidemic in 2002-2003, are likely characterized by frequent host-shifting events, whether they be animal-to-human (zoonosis), human-to-animal (reverse zoonosis), or animal-to-animal (
26, 44, 67,
115). Over the past 30 years, several coronavirus cross-species transmission events, as well as changes in virus tropism, have given rise to significant new animal and human diseases that implicate bovine coronavirus (BCoV), human coronavirus OC43 (HCoV-OC43), human coronavirus 229E (HCoV-229E), canine coronavirus (CCoV), feline coronavirus (FCoV), porcine coronavirus (PCoV), and transmissible gastroenteritis virus (TGEV) (
1, 58, 79, 80, 103, 104, 143,
144). Most notably, severe acute respiratory syndrome (SARS), a lower respiratory tract disease of humans that was first reported in late 2002 in Guangdong Province, China, quickly spread worldwide over a period of 4 months spanning late 2002 and early 2003 and infected over 8,000 individuals, killing nearly 800 before it was successfully contained by aggressive public health intervention strategies (
25, 69, 101, 102,
160). A coronavirus (SARS-CoV) was identified as the etiological agent of SARS, and assessments determined that the virus crossed to human hosts, most likely in southern China in Guangdong Province, from zoonotic reservoirs, including bats (
74), Himalayan palm civets (Paguma larvata), and raccoon dogs (Nyctereutes procyonoides), the latter two of which are sold in exotic animal markets (
44). In this review, we discuss the pleiotropic molecular mechanisms that govern coronavirus cross-species transmission both in vitro and in vivo, paying particular attention to SARS-CoV and SARS-like-CoV transmission events as models and comparing and contrasting the diversity of mechanisms governing virus cross-species transmission in outbreak settings.
Coronaviruses are enveloped RNA viruses that infect and cause disease in a broad array of avian and mammalian species, including humans. They contain the largest single-stranded, positive-sense RNA genomes currently known, ranging from 27 to nearly 32 kb in length. SARS-CoV, at 29 kb, encodes nine open reading frames (ORFs) (
20,
84,
115). While all CoVs carry strain-specific accessory genes in their downstream ORFs, the order of essential genes—the replicase/transcriptase gene (gene 1), Spike gene (gene 2 in SARS-CoV), envelope gene (gene 4), membrane gene (gene 5), and nucleocapsid gene (gene 9)—is remarkably conserved (Fig.
1). Within the virion, the single-stranded RNA (ssRNA) genome is encased in a helical nucleocapsid composed of many copies of the nucleocapsid (N) protein. The lipid bilayer envelope contains three proteins: envelope (E) and membrane (M), which coordinate virion assembly and release, and the large peplomer, S. Multiple copies of the S glycoprotein decorate the surfaces of CoV virions, conferring the virus's characteristic corona shape. S also serves as the principal mediator of host cell attachment and entry, utilizing virus- and host-specific cell receptors. For SARS-CoV, the angiotensin 1-converting enzyme 2 (ACE2) molecule has been shown to serve as a receptor (
73); CD209L has been implicated as a coreceptor in entry (
57). Receptor usage, as well as binding of other molecules, varies by group and even by strain among the coronaviruses (Table
1) (
31, 34, 43, 48, 57, 73, 83, 85, 109, 118, 137, 145, 148,
155); however, in the majority of studies to date, S—in particular the receptor binding domain (RBD) of S—remains the principal player in determining host range (
10, 30, 110, 117, 134, 135,
138).
Prior to the identification of SARS-CoV, coronavirus disease in humans was reported to result in mild upper respiratory tract illnesses caused by the two known pathogenic human coronaviruses (HCoVs), HCoV-229E and HCoV-OC43 (
139), although recent studies have revealed more-serious lower respiratory tract illness, including lethal disease in the elderly (
99). Subsequent to the SARS epidemic, other coronaviruses capable of causing disease in humans, HCoV-NL63 and HCoV-HKU1, were identified from archived nasopharyngeal aspirates (
140,
151). Infections with these viruses are associated with more-serious lower respiratory tract infections in infants, children, and adults, including croup, bronchiolitis, and pneumonia, though the true burden of the disease, especially in the very young, is not currently known (
131). Increased awareness of pathogenic human coronaviruses led to an escalation in research regarding their persistence in reservoir hosts, the molecular mechanisms governing their emergence and pathogenesis in the human population, and the factors required for successful vaccine and therapeutic interventions. These research pursuits are of particular merit when considered alongside the increasing awareness that coronaviruses can apparently breach cell type, tissue, and host species barriers with relative ease (
1, 6, 22, 58, 79-81, 97, 103, 120, 121, 125, 143,
144).
This review summarizes the structure and function of the class I fusion protein S, which mediates docking and entry into cells, speculating on how shuffling various S moieties between virus strains and groups may lead to host range expansion; investigates other alleles that may govern coronavirus cross-species transmission in cell culture and in vivo; discusses the possible molecular mechanisms governing the migration of SARS-CoV from zoonotic reservoirs into the human population; visits the concept of viral persistence as a mechanism for host range expansion; and explores receptor-independent entry as an alternative pathway for cross-species transmission. In this context, we will employ SARS-CoV as a model to outline the current state of knowledge regarding the molecular determinants of species-specific receptor engagement. The routine nature of viral cross-species transmission in the coronavirus family raises the question of how likely another pathogenic human coronavirus is to emerge and underscores the need to continue zoonotic surveillance and research centered around developing therapeutics and vaccines capable of neutralizing or preventing infection with and spread of these promiscuous viruses.
THE SPIKE GLYCOPROTEIN: SHUFFLING MOIETIES WITHIN A CLASS I FUSION PROTEIN
At ∼180 kDa in mass and visible in electron micrographs as a 20-nm projection from the virion surface, the S glycoprotein is second only to the replicase protein nonstructural protein 3 (nsp3) as the largest mature protein produced during coronavirus infections. SARS S glycoprotein, which forms a trimer in the virion, is organized into two subunit domains, an amino-terminal S1 domain, which contains the ∼200-amino-acid (aa) RBD, and a carboxy-terminal S2 domain, which contains the putative fusion peptide, two heptad repeat (HR) domains, and a transmembrane (TM) domain. This domain organization groups the CoV Spike protein with other class I viral fusion proteins, such as influenza virus hemagglutinin (HA), HIV-1 Env, simian virus 5 (SV5) F, and Ebola virus Gp2 (
16) (Fig.
2).
The RBD of the Spike protein is generally acknowledged as the principal determinant of host range (
30, 110, 117, 134, 135,
138) and will be described in further detail later in this review. Recent studies have implicated moieties within the CoV S2 region in host range expansion. A murine hepatitis virus (MHV) variant isolated from a persistent infection of murine astrocytoma delayed brain tumor (DBT) cells evidenced expanded usage of the human carcinoembryonic antigen-related cellular adhesion molecule (hCEACAM) rather than the murine CEACAM (mCEACAM) as a receptor. When the variant was sequenced, of the 13 mutations identified in the Spike-coding region, 6 were in the S2 subunit. Of those six, two were in the fusion peptide and two in heptad repeat 1 (HR1). Interestingly, when the S2 mutations were introduced in the wild-type (wt) MHV background, mCEACAM-mediated infectivity was severely hampered. Conversely, the mutations identified in the S1 domain did not substantially alter infectivity (
89); rather, combinations of four S2 residue alterations mediated host range expansion. In another study, it was demonstrated that paired mutations in the HR1 domain and fusion peptide of a heparan sulfate binding variant of MHV were sufficient to abolish mCEACAM dependence, effectively extending host range (
30).
Other viruses encoding class I fusion proteins have exhibited alterations in host range following mutation of their fusion subunits. An antiviral escape mutant of the retrovirus avian sarcoma and leukosis virus (ASLV) containing mutations in the HR1 region of the envelope TM subunit (analogous to the CoV S2 subunit) gained the ability to infect nonavian cells (
2). Interestingly, in three separate analyses of experimental evolution, H3-type influenza viruses selected for similarly mapping mutations in the globular bases of their HA2 subunits when adapting the human H3N2 virus to mice (
61), as well as when analyzing the avian progenitor's (H3N8) leap to humans (
45) and adaptation to dogs (
100) (Fig.
2). Although identification of the mechanism remains uncertain, mutations in and around the heptad repeats and fusion cores of viruses encoding class I fusion glycoproteins potentially represent an underappreciated yet conserved pathway for virus cross-species transmission.
Recent work suggests, in fact, that the separate moieties of Spike, including both the S1 and S2 subunits, may possess a degree of interchangeability that could influence host range. An elegant coronavirus reverse genetics system that has proven especially efficient in introducing mutations in CoV genome regions 3′ of ORF1 depends on the tropism-altering interchangeability of the Spike ectodomain and the intrinsic facility of coronavirus-targeted recombination (
29). Further, the locations of receptor binding domains in other coronaviruses hint at modularity; the MHV RBD is located at the very N terminus of the S1 domain, whereas the 229E RBD is located at the C terminus of S1 (
15,
18,
63,
75), suggesting the possibility that these domains were acquired by distinct, disparate recombination events. Additionally, in our recent reconstruction of the bat SARS-like CoV (Bat-SCoV), we were able not only to replace the RBD of Bat-SCoV with the human equivalent in order to generate infectious progeny but also to generate a recombinant human virus (Bat-F) in which the 3′ 5,700 nucleotides (nt), including the S-coding sequence 3′ of the RBD, were replaced with those from the Bat-SCoV sequence (
10). Both mutants were infectious in primate cells, suggesting an as-yet-undefined plasticity and perhaps a modular design in the Spike protein-coding sequence that allows for robust interchange of component parts. In particular, substitutions of entire functional cassettes of S1 and S2 may play pivotal roles in mediating CoV host range expansion, and this trend may extend to other viral class I fusion proteins as well. Additional research is needed to illuminate the fundamental mechanisms governing S2-mediated host range expansion both
in vitro and in vivo as well as the phylogenetic constraints on S domain interchangeability. Defining this aspect of CoV genetics will contribute to our understanding of viral phylogeny, may help better predict the emergence of new strains, and could facilitate the design of cross-strain therapeutic reagents.
ALLELIC REQUIREMENTS FOR SARS-CoV SPECIES SPECIFICITY: THE SPIKE RBD
Molecular evolution during the 2002-2003 outbreak and the subsequent mutational analyses of animal and human SARS-CoV strains revealed the presence of key mutational hotspots. Between Bat-SCoVs and civet and human SARS-CoVs, regions of high mutation include those of nsp3, a cleavage product from the ORF1a polyprotein, Spike, ORF3, and ORF8 (Fig.
1) (
26). When multiple isolates of civet and human SARS-CoVs were compared in detailed analyses, a key region likely to influence host range was identified, namely, the Spike RBD. Coronavirus Spike RBDs are virus specific, discrete, independently folded regions responsible for interfacing with the viral receptor. Many RBDs have been described, including the RBD for SARS-CoV, whose structure in complex with human ACE2 has been solved (
5, 15, 63, 72,
150). These regions vary in size, though they are usually between 180 and 330 amino acids in length, and they vary in their positions in the Spike S1 domain (
75). Comparison of Spike RBD sequences from civets, as well as from early-phase and late-phase human infections, presents evidence that the RBD experienced an increase in the population frequency of fit alleles (positive selection) in civet and early-phase human isolates and a decrease in allelic diversity via selection against less fit or deleterious alleles (negative selection) in late-phase human isolates (
26,
52). Across the RBD, only six amino acid residues differ between civet and human isolates (Fig.
3) (
26, 44, 59, 78, 102, 107, 115, 125,
147). Of these residues, four are located in the receptor-binding motif (RBM), the loop region of the RBD (residues 424 to 494 in human isolates) that contains 13 of the 14 residues that interface with ACE2 (T402 is N-terminal to the RBM). Of the four RBM residues, three are ACE2 interface residues (
72).
Surface plasmon resonance binding studies of four Spike S1 residues (344, 360, 479, and 487, the latter two of which are in the RBM of the RBD) demonstrated that (i) an SZ3 (civet isolate) RBD-Ig binds ACE2 more than 30,000-fold less efficiently, (ii) incorporation of either civet RBD residue 479 or 487 into the human RBD results in an approximately 20- to 30-fold decrease in binding efficiency, and (iii) incorporation of either civet residue 344 or 360 results in little-to-no loss of binding efficiency. Coordinately, incorporation of human residues 479 and 487 into a civet RBD Spike-pseudotyped virus enhanced infection of cells expressing human ACE2, while incorporation of civet residues 479 and 487 into a human RBD Spike-pseudotyped virus abolished infection of cells expressing human ACE2 (
76,
107).
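The residue numbering above can be checked mechanically. The following is a minimal sketch, assuming only the RBM boundaries quoted in this review for human isolates (residues 424 to 494); the function and constant names are illustrative and not taken from any published tool.

```python
# RBM boundaries for human isolates, as stated in the text (residues 424 to 494).
RBM_START = 424
RBM_END = 494

# The four Spike S1 residues examined in the binding studies described above.
SPR_RESIDUES = [344, 360, 479, 487]


def in_rbm(position: int) -> bool:
    """Return True if a Spike residue number falls inside the RBM."""
    return RBM_START <= position <= RBM_END


def classify(positions):
    """Map each residue number to 'RBM' or 'outside RBM'."""
    return {pos: ("RBM" if in_rbm(pos) else "outside RBM") for pos in positions}


if __name__ == "__main__":
    for pos, region in classify(SPR_RESIDUES).items():
        print(pos, region)
```

Under these boundaries, residues 479 and 487 fall within the RBM and residues 344 and 360 do not, which agrees with the description of the binding studies.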
Notably, RBDs constructed from TOR2 (late-phase human, identical to Urbani in the RBD), GZ02 (early-phase human) and civet isolates all bound civet ACE2, while only human isolates bound human ACE2 (
50,
76). Paired with the observation that some civet RBD sequences utilize the human amino acid at both residues 479 and 487, it is reasonable to speculate that substitutions in the RBD that increased human ACE2 binding affinity occurred in the palm civet host. This speculation is strengthened by structure model studies demonstrating that stepwise substitution at residues 479 and 487 enhanced RBD-human ACE2 (hACE2) interaction in vitro, possibly by eliminating unfavorable charges in the RBD-receptor interface (
71). Interestingly, models predicted these changes would have no effect on civet ACE2 (cACE2) affinity.
The importance of proper RBD-ACE2 interfacing was demonstrated in our laboratory in a study in which SARS-CoVs expressing either a wild-type civet Spike or a mutated civet Spike containing the human residue at position 479 (icSZ16-S K479N) were constructed (
121). Although the parent SZ16 viruses were incapable of replicating in Vero cells or mouse cells expressing hACE2, icSZ16-S K479N replicated poorly in Vero cells and was capable of recognizing hACE2 as a receptor. Serial passage on human airway epithelial cells (HAEs) rapidly selected for evolved viruses, icSZ16-S K479 D8 and icSZ16-S K479 D22, which exhibited enhanced growth on HAEs and DBT-hACE2 cells. The D8 and D22 variants retained their mutations at residue 479, and while no changes at residue 487 were noted, two additional interface residues were altered, Y442F and L472F. Homology modeling studies of these variants suggested that incorporation of these variant residues yielded more-efficient RBD-hACE2 interactions but inefficient recognition of cACE2.
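Substitution labels such as K479N, Y442F, and L472F encode the wild-type residue, the 1-based position, and the replacement residue. The sketch below shows one way to parse such labels and apply them to a sequence. The toy sequence and the label "K2N" are hypothetical and serve only to exercise the code; no real Spike sequence is used, and the helper names are assumptions.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str  # one-letter code of the original residue
    position: int   # 1-based residue number
    mutant: str     # one-letter code of the replacement residue


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'K479N' into its three components."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, mutant = match.groups()
    return Substitution(wild_type, int(position), mutant)


def apply_substitution(sequence: str, sub: Substitution) -> str:
    """Return a copy of sequence with sub applied, checking the wild-type residue."""
    index = sub.position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {sub.position} is outside the sequence")
    if sequence[index] != sub.wild_type:
        raise ValueError(
            f"expected {sub.wild_type} at position {sub.position}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + sub.mutant + sequence[index + 1:]


# Labels named in the text above.
PARSED = [parse_substitution(label) for label in ("K479N", "Y442F", "L472F")]

# Hypothetical four-residue toy sequence, used only to demonstrate the helper.
TOY_MUTANT = apply_substitution("MKYL", parse_substitution("K2N"))
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which are easy to make when isolates differ in length upstream of the RBD.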
Other studies have further implicated the RBD and its critical ACE2 interface as the prime barrier to host infection for SARS-like coronaviruses. Bat-SCoV-Spike-expressing pseudotyped viruses were unable to infect cells expressing bat, civet, or human ACE2 receptors, while pseudotyped viruses expressing Bat-SCoV-Spike containing the human RBD were able to infect hACE2-expressing cells (
111). In our laboratory, full-length Bat-SCoV RNA was replication competent but not infectious when transfected into Vero cells (
10). However, as described above, replacement of the equivalent bat RBD residues (Spike amino acids 323 to 505) with the human RBD residues 319 to 518 in the context of the infectious cDNA (Bat-SRBD virus) was sufficient to restore infectivity in Vero cells, though virus with replacement of the RBM alone replicated but was not infectious. Remarkably, while this virus also replicated in the aged BALB/c
in vivo mouse model, incorporation of a single amino acid substitution, Y436H (Bat-SRBD-MA), previously shown to enhance replication and pathogenesis in mice (
113), also significantly enhanced replication of Bat-SRBD-MA in mice (
10). Homology modeling of the substitution against a predicted structure of mouse ACE2 (mACE2) indicated an enhanced interface of the chimeric RBD with the mACE2 receptor. Thus, clear evidence for SARS-CoV tracking along ACE2 receptor orthologs was established by these studies, especially between civet and human hosts. However, the receptor for Bat-SCoV in bats remains unclear. It is possible that the immediate progenitor for the SARS-CoV epidemic strain has not been identified; alternatively, recombination insertion of variant RBDs may have mediated the initial cross-species transmission event from bats into other mammals.
The significance of the Spike-ACE2 interface is also illustrated in neutralizing-antibody analyses and neutralization escape studies. When sera collected from 2002-2003 (epidemic) convalescent human patients and sera from civets captured in 2004 were assessed against Tor2 (mid-phase epidemic strain) and GD03 (late-phase isolate) infections, 2002-2003 sera more efficiently neutralized Tor2, and civet sera more efficiently neutralized GD03. Multiple neutralizing epitopes have been identified, and the majority of these residues lie within the RBD, specifically within the RBD-ACE2 interface (
114,
128,
136,
162). Interestingly, two separate studies detailed the identification of neutralization escape SARS-CoV mutants with compensatory changes in the RBD interface region that could be subsequently neutralized by synergistic application of antibodies binding noncompeting epitopes, one of which was in S2 (
91,
133), suggesting that the evolution of the RBD under the selective pressure of the antibody elicits both proximal and distal changes in Spike sequence and structure.
POLYMERASE ERROR RATE AND HOMOLOGOUS RECOMBINATION: A CONSIDERATION OF MOLECULAR MECHANISMS FOR ALTERING TISSUE AND SPECIES TROPISM
Consideration of the nature of the coronavirus polymerase and its replication strategy immediately suggests many possible molecular mechanisms these viruses might employ to alter cell, tissue, and species tropisms. The following paragraphs will discuss three major mechanisms that were either likely or possibly employed in SARS-CoV emergence in the human population as a model for the field: polymerase error rate, homologous recombination, and persistence.
As RNA viruses, coronaviruses encode an RNA-dependent RNA polymerase (RdRp) to catalyze the production of new viral RNA.
In vitro studies have estimated the error rates of similar polymerases at 10⁻³ to 10⁻⁵ mutations per nucleotide (nt) per replication cycle (
33,
49). It has been shown that coronaviruses encode a contingent of putative and confirmed RNA-processing and -editing enzymes that are speculated to increase the fidelity of the RdRp, presumably due to the unusually large sizes of coronavirus genomes (
11-13, 24, 41, 53-55, 60, 90, 119,
124). Importantly, abolition of one of these processing activities, the exonuclease N (ExoN) activity encoded within nsp14 of ORF1 in murine hepatitis virus (MHV), resulted in a loss of polymerase fidelity of more than 10-fold, as measured in RNA isolated from plaque-forming wild-type and mutant viruses (error rates of 2.5 × 10⁻⁶ and 3.2 × 10⁻⁵, respectively), suggesting that the intrinsic coronavirus RdRp fidelity, in the absence of RNA proofreading activities, is in the range of that determined for other RdRps
in vitro. SARS-CoV mutants lacking ExoN activity have exhibited similar results (L. D. Eckerle, M. M. Becker, R. L. Graham, R. S. Baric, and M. R. Denison, unpublished data). In addition, little is known about the influence of selective pressure, either negative or positive, upon the fidelity of the coronavirus polymerase complex.
In vitro, serial passage of MHV in progressively mixed cultures of nonpermissive and permissive cells resulted in the isolation of a variant with a disproportionate number of mutations in S2 and hemagglutinin esterase (HE), suggesting that passage environment influences rate and selection (
8). Molecular evolution studies comparing human isolates place the SARS-CoV RdRp mutation rate in the range of 10⁻⁶ per nucleotide per replication cycle (
82,
141). Broader studies that incorporated animal isolates noted that the mutation rate slowed across the span of the epidemic but did not reach equilibrium, suggesting that fidelity may have relaxed in favor of adaptation to a new species of host (
26,
156); put another way, the greater selective pressures encountered during host species switches may have favored a lower level of RdRp fidelity.
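To make the scale of these error rates concrete, the sketch below works through the arithmetic for the two MHV rates quoted above (2.5 × 10⁻⁶ and 3.2 × 10⁻⁵ per nucleotide per replication cycle). Two assumptions are ours rather than the cited studies': errors are treated as independent across sites (a Poisson approximation), and the rates are applied to genome lengths spanning the 27- to 32-kb range given earlier in this review purely for illustration.

```python
import math

# Error rates quoted in the text (mutations per nucleotide per replication cycle).
WT_RATE = 2.5e-6           # wild-type MHV, ExoN active
EXON_MUTANT_RATE = 3.2e-5  # MHV lacking ExoN activity

# Illustrative genome lengths (nt) spanning the range given for coronaviruses.
GENOME_LENGTHS = (27_000, 29_000, 32_000)


def fold_change(mutant_rate: float, wt_rate: float) -> float:
    """Ratio of the mutant error rate to the wild-type error rate."""
    return mutant_rate / wt_rate


def expected_mutations(rate: float, genome_length_nt: int) -> float:
    """Expected number of mutations introduced per genome copy."""
    return rate * genome_length_nt


def prob_error_free_copy(rate: float, genome_length_nt: int) -> float:
    """Probability that a copy carries no mutations (assumes independent sites)."""
    return math.exp(-rate * genome_length_nt)


if __name__ == "__main__":
    print(f"fold change: {fold_change(EXON_MUTANT_RATE, WT_RATE):.1f}")
    for length in GENOME_LENGTHS:
        for name, rate in (("wt", WT_RATE), ("ExoN-", EXON_MUTANT_RATE)):
            print(
                f"{length} nt, {name}: "
                f"{expected_mutations(rate, length):.3f} mutations per copy, "
                f"P(error-free) = {prob_error_free_copy(rate, length):.3f}"
            )
```

The quoted rates differ by a factor of about 12.8. For a 29-kb genome, the wild-type rate corresponds to well under one mutation per copy, whereas the ExoN-deficient rate approaches one mutation per copy, which illustrates why proofreading is thought to matter for genomes of this size.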
Although the mechanism is unclear, these observations suggest that ExoN activity and overall RNA polymerase fidelity may be diminished in alternative host cell backgrounds and/or virus growth in periods of ecologic stress. Alterations of mutation rates, including site-specific mutation rates, in response to environment have been observed in multiple bacterial systems, including
Escherichia coli, Haemophilus influenzae, Neisseria meningitidis, Helicobacter pylori, and Staphylococcus aureus (
9, 35, 86, 87, 112,
146). Interestingly, for many of these examples, analysis of the emergence patterns of mutated isolates suggests the action of more-directed mechanisms than simply a stochastic selective process. Notably, with the exceptions of mutational hotspots in nsp3, Spike, ORF3, and ORF8, the majority of mutations between Bat-SCoVs and SARS-CoV isolates consist of point mutations (
67,
74,
110), some or most of which may have arisen simply from polymerase fidelity errors that were perpetuated as replication-neutral mutations; alternatively, some mutations may have arisen as a more directed response to altered selective pressures on the viral genome. The current data suggest that the effect of nsp14 ExoN function on polymerase fidelity should be evaluated in the context of cross-species transmission and disease emergence.
Analysis of the SARS-CoV genome yields clues that the virus may have employed mechanisms beyond fidelity error, however. Coronaviruses have demonstrated a marked capacity to employ homologous recombination, a process by which viruses exchange genetic material in the context of a coinfection (
65,
66). This process often takes advantage of the transcription regulatory network (TRN), a virus-specific series of 5- to 7-nt sequences (transcription regulatory sequences, or TRSs) situated at the 5′ end of each ORF that function to facilitate the incorporation of the viral leader sequence on subgenomic RNAs in the context of normal infection (
7,
65,
116,
157). Multiple lines of evidence implicate homologous recombination and host shifting in the phylogenetic history of SARS-CoV. An initial study immediately following the 2003 epidemic used Bayesian, neighbor-joining, and split decomposition analyses to determine that the SARS-CoV genome exhibited signs of a mosaic ancestry, with the 5′ end of the genome (the replicase/transcriptase gene) showing mammalian ancestry and the 3′ end (excluding Spike) showing avian ancestry. Although controversial (
42), analysis of the Spike gene showed evidence of a mosaic combination of mammalian and avian characteristics (
126), with a high level of identity to feline infectious peritonitis virus (FIPV), except for an ∼200-nt region from nt 2472 to nt 2694, which shows a higher level of identity with avian infectious bronchitis virus (IBV). Subsequent studies have substantiated and expanded upon this initial observation (
56,
93,
158). In fact, there is evidence of at least seven potential regions of recombination in the SARS-CoV genome in the replicase- and Spike-coding regions, with possible recombination partners that include porcine epidemic diarrhea virus (PEDV), transmissible gastroenteritis virus (TGEV), bovine coronavirus (BCoV), HCoV-229E, MHV, and IBV (
158). Of note, analysis of Bat-SCoV sequences has led to speculation that Bat-SCoV may have originated from a recombination event between the ORF1- and ORF2 (Spike)-coding sequences and that this recombination event may have occurred about 4 years before the SARS epidemic (
51). A similar study involving the human coronavirus HCoV-NL63 likewise demonstrated that HCoV-NL63 exhibited signs of having arisen from multiple recombination events from its nearest relative over the course of hundreds of years (
106). Further, a recent study identified a group 1 bat CoV that shared ancestry with HCoV-229E, which diverged about 200 years ago (
104). The results of these studies lead us to speculate that some, if not all, human CoVs may have diverged from bat ancestors. Efforts to gather empirical support for these bioinformatic studies are currently under way. While the exact phylogenetic origins and timeline of SARS-CoV emergence are as yet unknown, it seems clear from the available evidence that the genome that successfully infected humans may have been shaped in part by mutation and recombination events over an undetermined amount of time and in an as-yet-unidentified number of host species. Clearly, multiple empirical studies indicate that recombinant genomes are viable even across group 1 and 2 genealogies, especially in S (
8,
10,
88), and that recombination is shaping the population genetic structure of coronaviruses (
28) and likely influencing host range expansion.
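The idea of a mosaic genome, in which different regions most closely resemble different relatives, can be illustrated with a simple sliding-window identity comparison. This is a minimal sketch and not the method used in the studies cited above, which relied on Bayesian, neighbor-joining, and split decomposition analyses. The sequences are short, hypothetical, pre-aligned toy strings, and all names are illustrative.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def closest_parent_by_window(query: str, parents: dict, window: int, step: int):
    """For each window along the alignment, name the parent most similar to query.

    Ties are resolved in favor of the parent listed first.
    """
    calls = []
    for start in range(0, len(query) - window + 1, step):
        segment = query[start:start + window]
        scores = {
            name: percent_identity(segment, seq[start:start + window])
            for name, seq in parents.items()
        }
        calls.append((start, max(scores, key=scores.get)))
    return calls


# Hypothetical aligned toy sequences: the query matches parent_a in its first
# half and parent_b in its second half.
PARENTS = {
    "parent_a": "AAAAAAAAAAAAAAAA",
    "parent_b": "CCCCCCCCCCCCCCCC",
}
MOSAIC = "AAAAAAAACCCCCCCC"

CALLS = closest_parent_by_window(MOSAIC, PARENTS, window=4, step=4)
```

A switch in the best-matching parent along the genome, as between the second and third windows here, is the kind of signal that prompts a closer look for a recombination breakpoint. Real analyses must also account for rate variation and alignment uncertainty, which this sketch ignores.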
VIRAL PERSISTENCE: EMPLOYING STEALTH AS A FACTOR FOR EXPANDING HOST RANGE
Many coronaviruses, including SARS-CoV, are accomplished at establishing and maintaining persistent infections in vitro (
6,
21,
23,
97,
98,
154). In cell culture, persistent infections favor carrier cultures in which receptor expression is downregulated, selecting for the emergence of virus variants with mutations that either alter the affinity for the receptor or allow for recognition of new receptors for docking and entry into cells. Early MHV studies demonstrated that persistence resulted in a rapid accumulation of mutations in Spike, notably in the fusion core of the S2 domain, that were sufficient to alter cell type specificity or receptor affinity (
6). Based on these studies, we speculate that altered cell and tissue tropism following establishment of persistence may be followed subsequently by host range expansion. This phenomenon may be due to one mechanism or a combination of two mechanisms: homologue scanning and receptor/coreceptor shift. In the first mechanism, homologue scanning, gradual accumulation of mutations that enhance or alter Spike affinity for the receptor or homologues of the receptor in the persistent cell type may foster increased affinity for an orthologous receptor molecule in a different host species. Such a model appears to have been employed in the evolution of group 1 coronaviruses (such as HCoV-229E), many of which employ corresponding orthologs of aminopeptidase N (APN) as cellular receptors (
138), as well as by zoonotic SARS-CoV in its recognition of human ACE2 receptors (
120,
121). Examples of other virus families that alter host range by recognition of receptor orthologs include henipaviruses (
17) and arenaviruses (
108), among others. In the second mechanism, receptor/coreceptor shift, gene acquisition and/or mutation accumulation may drive the virus to recognize a completely different receptor or to require the additional recognition of a coreceptor for efficient cell entry. Examples include MHV strains that express the hemagglutinin esterase protein, which require sialic acid interactions in addition to CEACAM interactions to efficiently infect cells in vitro (
37). Additional data implicate cell surface molecules other than ACE2, such as DC-SIGN (CD209), L-SIGN (CD209L), and LSECtin (
43,
57,
85) (see Table
1), in cell engagement, and alterations in primary receptor affinity may enhance the requirement for receptor engagement cofactors for viral entry and thus effectively “switch” host receptor requirements and host range.
Both mechanistic possibilities are compelling when considering the evolution of the SARS-CoV receptor response. Civet and human orthologs of ACE2 have been shown to function as receptors for human and civet SARS-CoV isolates and are sufficient to confer permissiveness to nonpermissive cells, supporting the idea that coronaviruses often traffic along receptor orthologs during cross-species transmission (
73,
76,
120). However, a recent study demonstrated that expression of the Rhinolophus pearsonii (Pearson's horseshoe bat) ACE2 ortholog did not allow either SARS-CoV- or Bat-SCoV-Spike-pseudotyped viruses to enter cells (
111). It has long been known that many species of bats serve as reservoirs for a variety of viruses without displaying clinical signs of infection, in effect existing in a state of persistent infection with the tenant virus (
130), and it is interesting to note that while antibody- and RNA-positive bats did not exhibit clinical signs of SARS-like disease, humans and civets infected with SARS-CoV developed distinct signs of infection (
62, 64, 67, 74,
153). The roles of the bat reservoir in the perpetuation and evolution of endemic and epidemic coronavirus infections, as well as receptor usage in bat populations, are, and should continue to be, subjects of active study.
THE EMERGENCE OF SARS-CoV: ZOONOTIC CONDUITS AND RESERVOIRS
The emergence of SARS-CoV in the human population constitutes a prime real-world example of an RNA virus utilizing its molecular capabilities to alter its host range. Reports of the earliest cases of SARS in Guangdong involved employees of exotic meat markets in the province. Infected individuals tended to handle animals that were only recently captured from the wild and were consumed as delicacies by affluent individuals (
75,
122,
161). Subsequent analyses of nasal and fecal samples from wild-caught animals identified Himalayan palm civets (Paguma larvata) and raccoon dogs (Nyctereutes procyonoides) as potential reservoirs by both reverse transcription (RT)-PCR and immunoblotting (
44). Of the two candidate species, civets garnered special interest because of the capacity of SARS-CoV RNA to persist in infected animals for more than 2 weeks following initial infection (
153). Moreover, infections identified subsequent to the control of the primary SARS epidemic were associated with restaurants that prepared and served civet meat (
77,
125,
147), and culling of civets vastly reduced the numbers of infected animals in Guangdong marketplaces (
160).
However, multiple observations suggested that palm civets were simply conduits rather than the fundamental reservoirs of SARS-CoV-like viruses in the wild. RT-PCR studies comparing marketplace civets with civets in the wild determined that marketplace civets were disproportionately positive for viral RNA (
59). Also, comparisons of genome sequences from various civet isolates revealed ongoing mutation, suggesting that the virus was still adapting to the civet rather than persisting in equilibrium, as would be expected in a reservoir species (
59,
125). In fact, mutational analysis identified at least two separate transmission events that occurred between palm civets and humans: one during the main SARS epidemic in 2002-2003 and one during a series of sporadic infections that occurred in the winter of 2003-2004 (
125). Comparisons of human versus civet isolates revealed over 99.6% nucleotide identity (
122) (Fig.
1). Sequence analyses of human isolates from the late phase of the SARS epidemic indicated that negative selection was occurring in the Spike gene. However, calculations indicated that the Spike gene underwent positive selection during early civet-to-human transmission (
52,
156). Finally, analysis of samples taken from a healthy human cohort in Hong Kong in 2001 revealed the presence of antibodies against SARS-like viruses in 1.8% of the study population. Interestingly, most positive samples were positive for antibodies against animal isolates rather than human isolates. These observations suggest that substantial numbers of people may have been exposed to SARS-like viruses at least 2 years prior to the SARS epidemic (
159). Taken together, these observations suggest that palm civets did not serve as the primary reservoirs of SARS-CoV-like viruses from the 2002-2003 epidemic. Indeed, passage studies on HAE cultures of SARS-CoV isolates expressing civet ACE2 molecules selected for strains that recognized only the human receptor, leading to the hypothesis that the civet/human transmission cycle had been selected over several years, while the virus pool in both populations was maintained (
121). This finding further implicates a common progenitor reservoir that was neither human nor civet.
In 2005, two groups independently reported the identification of SARS-CoV-like RNA sequences and anti-SARS nucleocapsid antibodies in an Old World species of horseshoe bats in the genus
Rhinolophus, with especially high combined antibody/RNA prevalences in
Rhinolophus sinicus and
Rhinolophus macrotis (
67,
74,
122). Interestingly, high titers of antibodies correlated with low levels of RNA, suggesting that the viruses were actively replicating in these animals.
While neither team was able to successfully cultivate virus from bat samples, sequencing efforts netted full-length genomes from all three of the sampling locations that yielded positive samples. Bat SARS-like-CoVs (Bat-SCoVs) range in genome size from 29,690 to 29,749 nt, making them similar in genome length to SARS-CoV (29,727 nt), and they share 88 to 92% nucleotide identity with SARS-CoV (
67,
74,
110). While most gene sequences shared high identity (80 to 100%, with most genes in the range of 90 to 100%), distinct regions within nsp3, Spike (particularly the S1 domain), ORF3, and ORF8 were the most variable (see Fig.
1). Variations in these regions consisted of point mutations, deletions, and insertions of both small and large regions of sequence. These variations place Spike identity at 76 to 78% (63% for the S1 domain) and ORF8 identity at 34% compared to those for SARS-CoV. Of note, the 29-nt region in ORF8 present in palm civet and early human phase SARS-CoV isolates was also present in Bat-SCoV sequences (
26,
110). Analyses of nonsynonymous and synonymous substitution rates in the Bat-SCoVs indicate that these viruses have not undergone the positive selection pressure that would suggest a recent species-crossing event. Instead, these analyses suggest that Bat-SCoVs have been evolving independently, presumably in bat hosts, for a long time (
110). Thus, these data suggest that Old World horseshoe bats such as those in the
Rhinolophus genus serve as reservoir species for SARS-like coronaviruses. It can also be speculated that similar species may harbor viruses with closer evolutionary relationships to the viruses that infected civets, raccoon dogs, and then humans in the SARS outbreak in 2002.
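The distinction between nonsynonymous and synonymous substitutions that underlies these selection analyses can be illustrated with a minimal sketch. It is not the method used in the cited studies: it only counts raw codon differences between two aligned coding sequences and does not normalize by the number of synonymous and nonsynonymous sites, as a real dN/dS estimate (for example, by the Nei-Gojobori method or a codon-model likelihood approach) would. The function name and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def count_substitutions(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) codon differences.

    Both inputs are assumed to be aligned, in-frame coding sequences of
    equal length. Codons containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and equal in length")
    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue  # no change at this codon
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base; not classified
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1      # same amino acid: synonymous
        else:
            nonsyn += 1   # different amino acid: nonsynonymous
    return syn, nonsyn


# Toy example (made-up sequences): GCT->GCC keeps Ala, AAA->AGA changes Lys to Arg.
syn, nonsyn = count_substitutions("ATGGCTAAA", "ATGGCCAGA")
print(syn, nonsyn)
```

An excess of nonsynonymous over synonymous change, once properly normalized, is the usual signature of positive selection. Its absence in the Bat-SCoV comparisons is what supports long-term independent evolution in bats.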
An examination of species-to-species conservation in the ACE2 molecule further complicates the evolutionary picture. Structural studies of hACE2 in complex with the RBD of SARS-CoV identified 18 key ACE2 residues across three regions of the protein that directly interface with the RBD (
72) (Fig.
4). While ACE2 molecules are well conserved on the whole across mammalian species (at or above 90% homology), homology across these interacting residues is not as well conserved. It is puzzling that the mouse ACE2 interface, which exhibits the lowest homology at these residues, supports infection with human epidemic strain SARS-CoV in an
in vivo model. Civet and bat interfaces possess similar homologies, yet while cACE2-transfected cells support both civet and human strain infections (
120), neither cACE2- nor hACE2-transfected cells support infection of Bat-SCoV unless it encodes the human RBD (hRBD) (
10 and unpublished data). In fact, the region of Bat-SCoV that aligns with RBDs from human and civet strains is markedly different, including several large deletions across the RBM, the region corresponding to ACE2 interface residues (Fig.
3). Further, field studies of SARS-like-CoV-positive bats show no evidence of replicating virus in respiratory swabs (
74). Certainly, these data do not address structural changes that cannot be predicted with certainty from homology modeling against the human molecular complex. However, these dichotomies leave many questions unanswered. Do bats serve as a reservoir species for SARS-like CoVs, with civets, raccoon dogs, and possibly other unidentified animals functioning as conduits, or are bats actually functioning as the direct reservoirs of the human epidemic? Is there an unidentified coreceptor for bat SARS-like CoVs that potentiates the ACE2 interaction in bats, civets, or humans? Is the mutational burden for a bat SARS-like CoV too great to be overcome in a single conduit, or were multiple conduit species involved in the establishment of the virulent strain in the human population? Finally, is the bat ACE2 molecule the
bona fide receptor for bat SARS-like CoVs and/or can these orthologs function as receptors for early human epidemic/civet SARS strains? Clearly, more empirical studies are essential to address these important questions.
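The contrast drawn above between whole-molecule conservation and conservation at the RBD-contacting residues can be made concrete with a short sketch. It assumes two pre-aligned protein sequences and a list of alignment positions treated as the interface. The sequences and positions below are invented for illustration and are not real ACE2 data, and simple percent identity is used here in place of the homology measure reported in the text.

```python
def percent_identity(seq_a, seq_b, positions=None):
    """Percent identity between two aligned sequences.

    If `positions` (0-based alignment indices) is given, only those
    columns are compared. Columns with a gap in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = range(len(seq_a)) if positions is None else positions
    compared = [
        (seq_a[i], seq_b[i])
        for i in columns
        if seq_a[i] != "-" and seq_b[i] != "-"
    ]
    if not compared:
        raise ValueError("no comparable positions")
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


# Hypothetical aligned fragments and hypothetical "interface" columns.
reference = "MSSSSWLLLS"
ortholog = "MSSSCWLLFS"
interface = [4, 8, 9]

print(percent_identity(reference, ortholog))                       # 80.0
print(round(percent_identity(reference, ortholog, interface), 1))  # 33.3
```

In this toy case the two sequences look well conserved overall but diverge sharply at the chosen interface columns, the same pattern described for ACE2 orthologs.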
Bats account for roughly 20% of mammalian species on Earth and are among the most diverse and widely distributed nonhuman mammals (
32). They have been implicated as reservoirs for a variety of diseases that affect humans, including rabies, Hendra and Nipah virus infections, and potentially for Ebola and Marburg virus infections (
3,
4,
14,
36,
70,
94-
96). That bats have been shown to harbor more than 60 different RNA viruses underscores the importance of developing reagents, including cell culture systems, for detection of viruses that propagate in these animals (
19,
46,
149).
CONCLUSIONS: THE POTENTIAL FOR REEMERGENCE OF SARS-LIKE COVS AND THE CHALLENGE OF PREVENTING THE UNKNOWN
While the primary human epidemic was quickly controlled and potential host civets were culled as a preventative measure, there is increasing evidence that bat species serve as reservoirs not only of SARS-like coronaviruses but also of multiple other strains of coronaviruses, some of which are comparatively close relatives of circulating human strains (
27,
32,
40,
68,
104,
105,
132,
152). Bats, in fact, have come under increased scrutiny as harbingers of RNA virus-mediated diseases (
149) and have been proposed as the ultimate reservoir of all existing human strains of coronaviruses (
32,
74). With a theme distressingly reminiscent of the coinfection/reassortment mechanisms employed by influenza virus, coinfections with phylogenetically distinct strains of coronaviruses have been reported in a high proportion of sampled animals of the bat species
Miniopterus pusillus in Hong Kong. These bats cohabit with other
Miniopterus species that harbor yet-more-distinct coronavirus infections (
27). Based on available evidence, it seems that the question of emergence of another pathogenic human coronavirus from bat reservoirs might be more appropriately expressed as “when” than as “if”. The main unknown is whether a newly emergent human coronavirus will be susceptible to neutralization or control by any therapeutic or vaccine measures developed from SARS-CoV isolates and sequence data. Furthermore, the ease of bat-human or bat-animal cross-species transmission should be thoroughly examined with respect to both mutation-driven evolution and RNA recombination-driven processes.
The current paradigm argues that the progenitor of SARS-CoV was a bat virus that jumped into civets as an intermediate host, where changes in the RBD that allowed recognition of civet ACE2 were selected prior to transmission and adaptation to the human host (Fig.
5A). This may well be the case. However, phylogenetic studies also indicate that the existing bat strains are more closely related to early human epidemic strains, which alternatively suggests direct bat-human transmission as the initial precursor event, followed by differential radiation within the two species (Fig.
5B). It is also interesting that human strains that were characterized during the epidemic maintained efficient hACE2/cACE2 recognition, yet
in vitro-adapted civet strains rapidly gained hACE2 recognition (
120). These data suggest that efficient human/civet ACE2 recognition was key for maintaining SARS-CoV in human populations, providing an animal reservoir for continued persistence. If so, SARS-like viruses were likely persisting and causing human disease prior to the 2002-2003 outbreak on a much smaller, but noteworthy, scale. Detailed serologic studies on archived serum samples would provide considerable insight into this possibility.
Since the Spike protein consistently emerges as a principal determinant of viral entry, it is perhaps logical to presume that coronavirus vaccine and therapeutic strategies should target the Spike protein and, in particular, the RBD. Multiple studies have demonstrated the cross-neutralizing potential of human monoclonal antibodies (neutralizing MAbs [nMAbs]) raised against the SARS-CoV Spike and have shown that the vast majority of their epitopes map to the RBD (
10,
47,
114,
123,
127,
162). Indeed, results showing that a neutralization-resistant animal Spike regained susceptibility to nMAbs upon adapting to human airway epithelia are encouraging (
121). However, evidence is mounting that coronaviruses are quite capable of shifting receptor affinity, an ability that would most assuredly place them outside of the prohibitory curtain of most nMAbs and perhaps of vaccines. It is therefore becoming increasingly evident that development of therapeutic avenues and vaccines that target broader, more universally conserved alleles and a variety of loci across phylogenetic subclusters is of paramount importance, especially for emerging viruses that originate from highly heterogeneous pools of precursor zoonotic viruses. Such a strategy has already been employed for influenza virus, and an nMAb that recognizes a conserved region of the HA stem, rather than the receptor binding region, has so far proven resistant to the development of escape mutants (
129).
Investigators face a daunting black box with emerging viruses: the challenge of developing a universal therapeutic agent to combat a genetically proficient virus that quite likely has many more options for emergence than we have yet considered. In essence, the SARS-CoV outbreak and ecology reaffirm the desperate need for innovative approaches for developing vaccines and therapeutics against zoonotic viruses that exist in heterogeneous, highly variant quasispecies pools. Furthermore, as a model of public health response to an emerging respiratory virus, the response to the SARS-CoV outbreak highlights the critical need to fine-tune detection and diagnostic mechanisms. It is postulated that all human coronaviruses originated from bat strains (
142). The likelihood that other potentially lethal coronaviruses are harbored in bats suggests that another outbreak could occur on a similar time scale to that of the SARS-CoV outbreak, in which case response times to emergent disease would have to be measured in months or even weeks, not in years. Thus, the SARS-CoV outbreak serves as a harbinger, underscoring the absolute necessity for the development of platform strategies to rapidly counteract newly emerging disease threats before they occur.