INTRODUCTION
Coronaviruses (CoVs) infect humans and a wide variety of animals, causing respiratory, enteric, hepatic, and neurological diseases of various degrees of severity. They have been classified traditionally into groups 1, 2, and 3 based on genotypic and serological characteristics (
1,
2). Recently, the nomenclature and taxonomy of CoVs have been revised by the Coronavirus Study Group of the International Committee for Taxonomy of Viruses (ICTV). They are now classified into three genera,
Alphacoronavirus,
Betacoronavirus, and
Gammacoronavirus, replacing the three traditional groups (
3). Novel CoVs, which represented a novel genus,
Deltacoronavirus, have also been identified (
4,
5). While CoVs from all four genera can be found in mammals, bat CoVs are likely the gene source of
Alphacoronavirus and
Betacoronavirus, and avian CoVs are the gene source of
Gammacoronavirus and
Deltacoronavirus (
5–7).
CoVs are well known for their high frequency of recombination and high mutation rates, which may allow them to adapt to new hosts and ecological niches (
1,
8–12). This is best exemplified by the severe acute respiratory syndrome (SARS) epidemic, which was caused by SARS CoV (
13,
14). The virus has been shown to have originated from animals, with horseshoe bats as the natural reservoir and palm civet as the intermediate host allowing animal-to-human transmission (
15–18). Since the SARS epidemic, many other novel CoVs in both humans and animals have been discovered (
4,
7,
19–24). In particular, a previously unknown diversity of CoVs has been described in bats from China and other countries, suggesting that bats are important reservoirs of alphaCoVs and betaCoVs (
16,
18,
25–32).
In September 2012, two cases of severe community-acquired pneumonia were reported in Saudi Arabia which were subsequently found to have been caused by a novel CoV, Middle East respiratory syndrome coronavirus (MERS-CoV), previously known as human betaCoV 2c EMC/2012 (
33,
34,
35). As of May 2013, a total of 49 laboratory-confirmed cases of MERS-CoV infection have been reported with 27 deaths (
36), giving a crude fatality rate of 55%. So far, most cases of MERS-CoV infection presented with severe acute respiratory illness (
36,
37). A macaque model for MERS-CoV infection has also been established which showed that the virus caused localized-to-widespread pneumonia in all infected animals (
38). The viral virulence may be related to the ability of MERS-CoV to evade the innate immunity with an attenuated beta interferon response (
39–41). Moreover, the ability to cause human-to-human transmission has raised the possibility of another SARS-like epidemic (
36,
37). However, the source of this novel CoV is still obscure, which has hindered public health and infection control strategies for disease prevention. Phylogenetically, MERS-CoV belongs to
Betacoronavirus lineage C, being closely related to
Tylonycteris bat CoV HKU4 (Ty-BatCoV HKU4) and
Pipistrellus bat CoV HKU5 (Pi-BatCoV HKU5), previously discovered in lesser bamboo bat (
Tylonycteris pachypus) and Japanese pipistrelle (
Pipistrellus abramus) in Hong Kong, China, respectively (
31,
32,
42,
43). Moreover, potential viruses with partial gene sequences closely related to MERS-CoV have also been detected in bats from Africa, Europe, and America, although complete genome sequences were not available (
44,
45). MERS-CoV is able to infect various mammalian cell lines, including primate, porcine, bat, and rabbit cells, which may be explained by the use of the evolutionarily conserved dipeptidyl peptidase 4 (DPP4) as its functional receptor (
46,
47). These results suggested that MERS-CoV may possess broad species tropism and may have emerged from animals. However, the direct ancestor virus and animal reservoir of MERS-CoV are yet to be identified.
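The crude fatality rate quoted above is the number of deaths divided by the number of laboratory-confirmed cases. A minimal sketch of that arithmetic follows; the function name is hypothetical and chosen only for illustration, and the counts are those reported in the text as of May 2013.

```python
def crude_fatality_rate(deaths: int, confirmed_cases: int) -> float:
    """Return deaths as a percentage of laboratory-confirmed cases.

    Hypothetical helper for illustration; it makes no adjustment for
    unresolved cases or undetected mild infections.
    """
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    if not 0 <= deaths <= confirmed_cases:
        raise ValueError("deaths must lie between 0 and confirmed_cases")
    return 100.0 * deaths / confirmed_cases


# Counts reported in the text: 27 deaths among 49 confirmed cases.
rate = crude_fatality_rate(deaths=27, confirmed_cases=49)
print(f"Crude fatality rate: {rate:.1f}%")  # about 55%
```

Because the denominator includes only laboratory-confirmed cases, a figure of this kind is likely to overestimate the true fatality rate if milder infections go undetected.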
To better understand the evolutionary origin of MERS-CoV and the possible role of bats as the reservoir for its ancestral viruses, studies on the genetic diversity and evolution of lineage C betaCoVs in bats would be important. We attempted to study the epidemiology of lineage C betaCoVs, including Ty-BatCoV HKU4 and Pi-BatCoV HKU5, among various bat species in Hong Kong, China. The complete RNA-dependent RNA polymerase (RdRp), spike (S), and nucleocapsid (N) genes of 13 Ty-BatCoV HKU4 and 15 Pi-BatCoV HKU5 strains were sequenced to assess their genetic diversity and evolution. The results revealed that the two viruses were stably evolving in their respective hosts and diverged from their common ancestor a long time ago. However, the S protein of Pi-BatCoV HKU5 exhibited marked sequence divergence and many more positively selected sites than that of Ty-BatCoV HKU4, which may suggest the ability of Pi-BatCoV HKU5 along with its host to occupy new ecological niches. The potential implications for the animal origin of MERS-CoV are also discussed.
DISCUSSION
In this study, Ty-BatCoV HKU4 and Pi-BatCoV HKU5 were found to be highly prevalent among lesser bamboo bat and Japanese pipistrelle in Hong Kong, respectively, with detection rates of 25% to 29% in their alimentary samples. In line with previous studies, MERS-CoV is more closely related to
Betacoronavirus lineage C than to lineages A, B, and D in the RdRp, S, and N genes (
34,
42,
43). Nevertheless, the genetic distance between MERS-CoV and the various strains of Ty-BatCoV HKU4 and Pi-BatCoV HKU5 was still large, with their S proteins having ≤67.4% aa identity. Two recent studies have identified partial gene sequences closely related to MERS-CoV in bats from Africa, Europe, and America, suggesting that lineage C betaCoVs are distributed in bats worldwide (
44,
45). In one study, CoVs related to MERS-CoV were detected in 46 (24.9%)
Nycteris bats and 40 (14.7%)
Pipistrellus bats from Ghana and Europe using RT-PCR targeting a 398-bp fragment of the RdRp gene (
44). The extended 904-bp RdRp sequences of three strains from Romania and Ukraine shared 87.7% to 88.1% nucleotide and 98.3% amino acid identity with those of MERS-CoV. In the corresponding regions, Ty-BatCoV HKU4 shared 80.3% to 82% nucleotide and 92% to 92.4% amino acid identity with MERS-CoV, and Pi-BatCoV HKU5 shared 82.4% to 83.7% nucleotide and 94% to 94.4% amino acid identity. In another study, screening of 606 bats from Mexico also showed the presence of a betaCoV closely related to MERS-CoV in a
Nyctinomops laticaudatus bat (
45). Although the authors claimed to have used a 329-bp fragment of the RdRp gene for RT-PCR and sequence analysis, the available sequence was in fact within nsp14. Analysis of this partial nsp14 sequence showed that it shared 85.7% nucleotide and 95.5% amino acid identity with that of MERS-CoV (
45). In the corresponding regions, Ty-BatCoV HKU4 shared 81.9% nucleotide and 88.6% amino acid identity with MERS-CoV, and Pi-BatCoV HKU5 shared 83.4% to 84.2% nucleotide and 92% amino acid identity. However, complete gene sequences were not available from these bat CoVs to allow more detailed phylogenetic analysis. Molecular clock analysis of the complete RdRp gene dated the time to the most recent common ancestor (tMRCA) of MERS-CoV and Pi-BatCoV HKU5 at around 1520, whereas analysis of the N gene dated the tMRCA of MERS-CoV, Ty-BatCoV HKU4, and Pi-BatCoV HKU5 at around 1324. Using the 904-bp RdRp sequences available from the three European strains, the tMRCA of MERS-CoV and the European bat CoV strains was dated at around 1859. Our results suggested that Ty-BatCoV HKU4, Pi-BatCoV HKU5, and MERS-CoV diverged at least centuries ago from their common ancestor. Although MERS-CoV and the European bat CoV strains were estimated to have diverged more recently, this is unlike the situation in SARS-related CoVs, in which civet and bat strains diverged only a few years before the SARS epidemic (
17). Therefore, these bat lineage C betaCoVs were unlikely to be the direct ancestor of MERS-CoV. However, the present analysis is limited by the lack of more sequences from potential intermediate virus species or strains with widely distributed and well-determined dates, which better reflect the different selective pressures over the long period of time as these viruses evolved. Further studies on bats and other animals are required to fill the gap between these bat lineage C betaCoVs and MERS-CoV during their evolution. Moreover, longer gene or complete genome sequence data from these animal viruses would be important for more accurate taxonomic and evolutionary studies.
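The nucleotide and amino acid identities compared above are pairwise percentages computed over aligned sequences. The sketch below shows one common way to compute such a value. It assumes the two sequences are already aligned to equal length and that columns containing a gap are excluded from the denominator. Other conventions exist, and the convention used for the figures in this study is not stated in this section. The function name and the toy sequences are hypothetical and are not data from this study.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap are skipped,
    so the denominator is the number of ungapped aligned columns.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, for illustration only).
fragment_1 = "ATGGCTAAAC"
fragment_2 = "ATGGCGAAAC"
print(f"{percent_identity(fragment_1, fragment_2):.1f}% identity")
```

Applied to a 904-bp RdRp alignment, the same calculation would yield values on the scale reported above. Short fragments give coarse estimates, which is one reason longer gene or complete genome sequences are preferable.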
The divergent sequences of the S genes of Pi-BatCoV HKU5 may suggest that the virus has a better ability to generate variants to occupy new ecological niches. The S proteins of CoVs are responsible for receptor binding and host adaptation and are therefore among the most variable regions within CoV genomes (
16,
18,
28). Studies on SARS CoV have shown that changes in its S protein, both within and outside the receptor binding domain, could govern CoV cross-species transmission and emergence in new host populations (
83,
84). We have also previously demonstrated recent interspecies transmission of an alphaCoV, BatCoV HKU10, from Leschenault's rousettes to Pomona leaf-nosed bats, and the virus has been rapidly adapting in the new host by changing its S protein (
59). In this study, Ty-BatCoV HKU4 and Pi-BatCoV HKU5 were exclusively detected in lesser bamboo bat (
Tylonycteris pachypus) and Japanese pipistrelle (
Pipistrellus abramus), respectively. Moreover, the
Ka/
Ks ratios of the RdRp, S, and N genes in both viruses were low, supporting the idea that the two bat species were the respective primary reservoirs for the two CoVs. Nevertheless, in comparison to that of Ty-BatCoV HKU4, the S gene of Pi-BatCoV HKU5 exhibited much higher sequence divergence among different strains due to both synonymous and nonsynonymous substitutions. Moreover, a much higher number of positively selected sites were observed in the S gene of Pi-BatCoV HKU5 than in that of Ty-BatCoV HKU4, with most of the sites under selection being distributed within the S1 region which likely contains the RBD. This suggested that the S1 region of Pi-BatCoV HKU5 may have been under functional constraints in its host species, Japanese pipistrelle, which may have favored adaptation to new hosts or environments.
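The Ka/Ks ratio discussed above compares the rate of nonsynonymous substitutions per nonsynonymous site (Ka) with the rate of synonymous substitutions per synonymous site (Ks). The sketch below illustrates the final step of a counting approach, in which observed proportions of differences are corrected for multiple hits with the Jukes-Cantor formula before the ratio is taken. This is an assumption made for illustration, since the estimation method used in this study is not described in this section. The site and difference counts are hypothetical inputs, not data from this study.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance for an observed proportion of differences p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks(nonsyn_diffs: float, nonsyn_sites: float,
          syn_diffs: float, syn_sites: float) -> float:
    """Ka/Ks from counts of differences and sites (hypothetical helper)."""
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ks = jukes_cantor(syn_diffs / syn_sites)
    if ks == 0.0:
        raise ValueError("Ks is zero; the ratio is undefined")
    return ka / ks


# Hypothetical counts for a pair of sequences from one gene.
ratio = ka_ks(nonsyn_diffs=6, nonsyn_sites=600, syn_diffs=20, syn_sites=200)
print(f"Ka/Ks = {ratio:.3f}")  # well below 1 for these inputs
```

A ratio well below 1, as with these inputs, is conventionally read as purifying selection, whereas a ratio above 1 suggests positive selection. A gene-wide ratio can remain low even when individual codons are under positive selection, which is why site-level tests are applied separately to find positively selected sites.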
The marked polymorphisms in the S protein of Pi-BatCoV HKU5 may reflect the biological characteristics of its host species, Japanese pipistrelle, which is a small-size, insectivorous bat with a body weight of 4 to 10 g. It is considered the most common bat species found in urban areas of Hong Kong (
85). While it is abundant in wetland areas, its roosts are frequently found in towns and villages as well as in various types of buildings and other man-made structures, such as fans or air conditioners. It is also known to utilize bat houses or boxes as its roosts. Such diverse habitat and adaptability to harsh environments may have favored the mutation of Pi-BatCoV HKU5, especially in its S protein, which is responsible for receptor binding and immunogenicity. Interestingly, this bat species is widely distributed not only in China, Russia, the Korean peninsula, Japan, Vietnam, Burma, and India but also in the Kingdom of Saudi Arabia and neighboring countries (
42,
85). Moreover, other
Pipistrellus bats, including
P. arabicus,
P. ariel,
P. kuhlii,
P. pipistrellus,
P. rueppellii, and
P. savii, have been recorded in the Arabian Peninsula (
http://www.iucn.org/). In fact, partial sequences closely related to those of MERS-CoV detected in bats from Europe also originated from
Pipistrellus bats (
P. pipistrellus,
P. nathusii, and
P. pygmaeus) of the family
Vespertilionidae, and those from Ghana originated from
Nycteris bats (
Nycteris cf. gambiensis) of the related family
Nycteridae (
44). Similarly, the bat betaCoV strain related to MERS-CoV detected in Mexico originated from a
N. laticaudatus bat belonging to the
Molossidae, a family closely related to
Vespertilionidae (
45,
86). The difference between this BatCoV and MERS-CoV within the partial nsp14 sequence was also found to be mainly due to substitutions in the third nucleotide positions, suggesting strong purifying selection (
45). However, S gene sequences were not available from these bat viruses for further analysis of polymorphisms and selective pressures. Nevertheless, based on our existing data, bats belonging to
Vespertilionidae and related families, especially
Pipistrellus bats and those with diverse habitats, in the Arabian Peninsula should be intensively surveyed for potential ancestral viruses of MERS-CoV, which may have evolved through mutations in the S gene, especially in the RBD, allowing efficient transmission to other animals or humans. In contrast, lesser bamboo bats, the host species for Ty-BatCoV HKU4 and one of the smallest mammals in the world, with a body weight of 3 to 7 g, have much more restricted habitats. Though this species also belongs to the family
Vespertilionidae, it is remarkably adapted to roost inside bamboo stems and is mainly found in rural areas in Hong Kong and various Asian countries (
85). This may, in turn, reflect the lower mutation rate observed in the S gene of Ty-BatCoV HKU4.
It remains to be determined if Ty-BatCoV HKU4 and Pi-BatCoV HKU5, as well as other lineage C betaCoVs in bats, utilize the same receptor as MERS-CoV. Recent studies have shown that MERS-CoV utilizes DPP4 as its functional receptor (
47,
79). This suggested that these betaCoVs belonging to lineage C may utilize a receptor(s) different from those of other CoVs. Moreover, expression of bat (
P. pipistrellus) DPP4 in nonsusceptible cells was found to enable infection by MERS-CoV (
47), which is in line with the ability of the virus to replicate in cell lines from
Rousettus,
Rhinolophus,
Pipistrellus,
Myotis, and
Carollia bats (
79). As DPP4 is an evolutionarily conserved protein (
47), it may also explain the broad species tropism observed in primate, porcine, and rabbit cell lines and reflect the zoonotic origin of MERS-CoV (
46,
79). However, Ty-BatCoV HKU4 and Pi-BatCoV HKU5, as with other bat CoVs, have not been successfully cultured
in vitro, which hampers studies on their receptor binding and host adaptation. Further discoveries of lineage C betaCoVs in animals and studies on the receptors of the different animal counterparts in their respective hosts may help further understanding of the mechanism of interspecies transmission and emergence of MERS-CoV.
Bats are increasingly recognized as a reservoir for various zoonotic viruses, including SARS CoV, rabies virus and other lyssaviruses, Hendra virus, Nipah virus, Ebola virus, and influenza virus (
87,
88). While the existence of CoVs in bats was unknown before the SARS epidemic, it is now known that the different bat populations harbor diverse CoVs, which is likely the result of their species diversity, roosting behavior, and migrating ability (
16,
18,
29,
31,
32,
89). These warm-blooded flying vertebrates are also ideal hosts to fuel CoV recombination and dissemination (
5,
27,
59). It remains to be ascertained if bats could also be the animal origin for the emergence of MERS-CoV either directly or via an intermediate host, the latter as in the case of SARS CoV, where the bat ancestral virus may have jumped to the intermediate host when bats were in contact or mixed with other animals (
16). Since a history of contact with animals such as camels and goats has been reported in MERS-CoV-infected cases (
90), the virus may have jumped from bats to these animals before infecting humans. Surveillance studies of lineage C betaCoVs from bats and other animals in the Middle East may help identify the origin and chain of transmission of MERS-CoV.