Genome organization and coding potential of RbCoV HKU14.
Complete genome sequence data for four strains of RbCoV HKU14 were obtained by the assembly of the sequences of RT-PCR products from the RNA extracted directly from the corresponding individual specimens. The sizes of the genomes of RbCoV HKU14 were 30,904 to 31,116 bases, with a G+C content of 38%. The genome organization is similar to that of other
Betacoronavirus subgroup A CoVs, with the characteristic gene order 5′-replicase ORF1ab, hemagglutinin-esterase (HE), spike (S), envelope (E), membrane (M), nucleocapsid (N)-3′ (
Table 1 and
Fig. 1). Moreover, additional ORFs coding for nonstructural proteins, NS2a and NS5a, were found.
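The G+C content quoted above is a simple base-composition ratio. As a minimal sketch of that calculation (the function name and the toy sequence are illustrative assumptions, not the software or data used in this study):

```python
def gc_content(sequence: str) -> float:
    """Return the G+C fraction of a nucleotide sequence.

    Ambiguous symbols (anything other than A, C, G, T, U) are ignored,
    which is an assumption of this sketch.
    """
    counted = [base for base in sequence.upper() if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


# Made-up toy sequence, not taken from the RbCoV HKU14 genome.
toy = "AUGGCUAAUCUAAACUUUAAGG"
print(f"G+C content: {gc_content(toy):.1%}")
```

Applied to a full genome sequence, the same ratio would give the genome-wide G+C content.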
CoVs are characterized by a unique mechanism of discontinuous transcription with the synthesis of a nested set of subgenomic mRNAs (
47,
48). To assess the number and sizes of RbCoV HKU14 subgenomic mRNA species, a Northern blot analysis with a probe specific to the nucleocapsid sequence was performed. At least six distinct RNA species were identified, with sizes corresponding to predicted subgenomic mRNAs of HE (∼8,650 nt), S (∼7,350 nt), NS5a (∼3,030 nt), E (∼2,790 nt), M (∼2,380 nt), and N (∼1,680 nt) (
Fig. 2A).
The leader-body junction sequences of subgenomic mRNAs from RbCoV HKU14-infected cell cultures were determined and aligned to the leader sequence, which confirmed the core sequence of the TRS motifs as 5′-UCUAAAC-3′ (
Fig. 2B), as in other
Betacoronavirus subgroup A CoVs (
20,
31,
75,
77). The leader TRSs and subgenomic mRNAs of HE, S, and N exactly matched each other, whereas there was a one-base mismatch for NS2a, NS5a, E, and M. The RbCoV HKU14 common leader on subgenomic mRNAs was confirmed as the first 63 nucleotides of the RbCoV HKU14 genome.
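Candidate TRS sites of the kind described above can be located by scanning a genome for the core sequence while tolerating a single mismatch. The following is a minimal sketch under that assumption; the function name, the mismatch tolerance parameter, and the toy sequence are hypothetical and do not represent the method used to map the leader-body junctions in this study, which relied on sequencing of subgenomic mRNAs.

```python
CORE_TRS = "UCUAAAC"  # core TRS sequence reported in the text


def find_trs_candidates(genome: str, core: str = CORE_TRS, max_mismatches: int = 1):
    """Return (1-based position, matched window, mismatch count) for each
    window of the genome that differs from the core by at most max_mismatches."""
    rna = genome.upper().replace("T", "U")  # accept DNA or RNA alphabets
    k = len(core)
    hits = []
    for start in range(len(rna) - k + 1):
        window = rna[start:start + k]
        mismatches = sum(1 for a, b in zip(window, core) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start + 1, window, mismatches))
    return hits


# Made-up toy sequence containing one exact and one single-mismatch site.
for hit in find_trs_candidates("GGUCUAAACGGUCCAAACGG"):
    print(hit)
```

A scan like this only nominates candidates; a one-mismatch tolerance will also return sites that are not used for transcription, so experimental confirmation of the junctions remains necessary.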
The coding potential and characteristics of putative nonstructural proteins (nsp's) of ORF1 of strain RbCoV HKU14-1 are shown in
Tables 1 and
2, respectively. The ORF1 polyprotein possessed 72.0 to 87.2% amino acid identities to the polyproteins of other
Betacoronavirus subgroup A CoVs. The predicted putative cleavage sites were conserved between RbCoV HKU14 and members of
Betacoronavirus 1 (
Table 2). However, the lengths of nsp1, nsp2, nsp3, nsp13, and nsp15 in RbCoV HKU14 differed from those of corresponding nsp's in HCoV OC43, BCoV, ECoV, and/or PHEV, as a result of deletions or insertions. Interestingly, the genome of strain RbCoV HKU14-10 differed from the other three genomes in the nsp3 region by the presence of a 117-bp (39-amino-acid [aa]) deletion between PL2pro and the Y domain, a region that is variable among CoVs and whose function is unknown (
79).
All
Betacoronavirus subgroup A CoVs, except HCoV HKU1, possess an NS2a gene between ORF1ab and HE. Although the nucleotide sequence of RbCoV HKU14 at this region showed significant homology to that of the closely related species
Betacoronavirus 1, RbCoV HKU14 is unique in having this region broken into several small ORFs. The number and size of these small ORFs vary among the four sequenced strains, with two strains having four and the other two strains having three small ORFs (
Fig. 1 and
3). Nevertheless, analysis of the amino acid sequences of these small NS2a proteins showed that they possessed significant homologies to different regions of the single NS2a proteins in HCoV OC43, BCoV, ECoV, and PHEV (
Fig. 3), whereas the gaps between the small NS2a proteins of RbCoV HKU14 corresponded to deletions of amino acids conserved among the single NS2a proteins of HCoV OC43, BCoV, ECoV, and PHEV. Only two of these small NS2a proteins (NS2a4 of strain RbCoV HKU14-8 and NS2a3 of strain RbCoV HKU14-10) each contained one putative transmembrane domain predicted by TMHMM. Only the first small ORF, NS2a1, of RbCoV HKU14 was found to contain a preceding TRS, which was confirmed by the sequencing of its subgenomic mRNA leader-body junction (
Fig. 2B). While the single NS2a proteins were highly conserved among members of
Betacoronavirus 1, PHEV was found to possess a shorter single NS2a protein than HCoV OC43, BCoV, and ECoV as a result of the deletion of 84 amino acids at the C-terminal region (
Fig. 3) (
62). Although the
Betacoronavirus-specific NS2 protein has been shown to be nonessential for
in vitro viral replication (
50), cyclic phosphodiesterase domains in the NS2 proteins of coronaviruses as well as toroviruses have been predicted, and a possible role in viral pathogenicity in mouse hepatitis virus (MHV) was suggested (
8,
53). Further studies are required to understand the potential function of NS2a proteins in different betacoronaviruses, including RbCoV HKU14.
The amino acid sequence of the predicted S protein of RbCoV HKU14 is most similar to that of BCoV, with 93.6 to 94.1% identities. A comparison of the amino acid sequences of the S proteins of RbCoV HKU14 and BCoV showed 64 amino acid polymorphisms, 13 of which were seen within the region previously identified as being hypervariable among the S proteins of other
Betacoronavirus subgroup A CoVs (
6,
16,
41) (
Fig. 4), suggesting that this region in RbCoV HKU14 is also subject to strong immune selection. BCoV has been found to utilize
N-acetyl-9-O-acetylneuraminic acid as a receptor for the initiation of infection (
49). Among the five amino acids that may affect S1-mediated receptor binding in BCoV (
78), three (threonine at position 11, asparagine at position 115, and methionine at position 118) were conserved in RbCoV HKU14 (
Fig. 4). However, at positions 172 and 178, the asparagine and glutamine observed for BCoV were replaced by histidine and lysine in RbCoV, respectively. A previous study also identified seven amino acid substitutions in the S protein of BCoV that differed between virulent and avirulent cell culture-adapted strains (
78). Interestingly, five of these seven “virulent” amino acids were also conserved in RbCoV, while amino acid substitutions were observed for the other two (valine to threonine at position 33 and aspartic acid to alanine at position 469). It was also reported previously that an amino acid change at position 531 of the S protein of BCoV discriminated between enteric (aspartic acid or asparagine) and respiratory (glycine) strains (
76). In RbCoV HKU14, an aspartic acid was conserved at this site, which may be consistent with its detection in rabbit enteric samples.
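The percent identities and amino acid polymorphisms discussed above both follow from a column-by-column comparison of two aligned protein sequences. A minimal sketch of that comparison is given below; it assumes the two sequences are already aligned to equal length with "-" as the gap symbol, and the function name and toy sequences are hypothetical rather than the tools or data used in this study.

```python
def compare_aligned(seq_a: str, seq_b: str, gap: str = "-"):
    """Compare two pre-aligned sequences of equal length.

    Returns (identity, differences), where identity is the fraction of
    identical residues over columns without gaps, and differences is a
    list of (alignment column, residue in seq_a, residue in seq_b).
    Columns are 1-based alignment columns, not residue numbers.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    differences = []
    for column, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if a == b:
            identical += 1
        else:
            differences.append((column, a, b))
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return identical / compared, differences


# Made-up toy peptides, not the S proteins of RbCoV HKU14 or BCoV.
identity, diffs = compare_aligned("MTNMHK", "MTNMNQ")
print(f"identity = {identity:.1%}, differences = {diffs}")
```

Whether gapped columns are excluded or counted as mismatches changes the reported identity, so the convention should be stated alongside any value computed this way.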
Other predicted domains in the HE, S, NS5a, E, M, and N proteins of RbCoV HKU14-1 are summarized in
Table 1. NS5a of RbCoV HKU14 is homologous to the corresponding nonstructural proteins of members of
Betacoronavirus 1, with 85.3% to 91.7% amino acid identities. In MHV, the translation of the E protein is cap independent, via an internal ribosomal entry site (IRES) (
58). However, a preceding TRS, 5′-UCCAAAC-3′, can be identified upstream of the E protein of RbCoV HKU14 (
Fig. 2B), as in members of
Betacoronavirus 1 (
77). Downstream of the N gene, the 3′-untranslated region contains a predicted bulged stem-loop structure of 64 nucleotides (nucleotide positions 30797 to 30860) conserved in betacoronaviruses (
14). Downstream of this bulged stem-loop structure (nucleotide positions 30859 to 30910), a conserved pseudoknot structure, important for CoV replication, is also present.
Phylogenetic analyses.
The phylogenetic trees constructed by using the amino acid sequences of the 3CLpro, RdRp, helicase (Hel), S, M, and N proteins of RbCoV HKU14 and other CoVs are shown in
Fig. 5, and the corresponding pairwise amino acid identities are shown in
Table 3. For all six genes, the four strains of RbCoV HKU14 formed a distinct cluster within
Betacoronavirus subgroup A CoVs, and among the known
Betacoronavirus subgroup A CoVs, they were more closely related to members of the species
Betacoronavirus 1 (BCoV, ECoV, PHEV, and HCoV OC43) than to MHV and HCoV HKU1. However, a comparison of the amino acid sequences of the seven conserved replicase domains (ADP-ribose 1″-phosphatase [ADRP], nsp5 [3CLpro], nsp12 [RdRp], nsp13 [Hel], nsp14 [3′-to-5′ exonuclease {ExoN}], nsp15 [NendoU], and nsp16 [ribose-2′-O-methyltransferase {O-MT}]) for coronavirus species demarcation (
7) showed that RbCoV HKU14 possessed <90% amino acid identities to members of
Betacoronavirus 1 in the ADRP (except ECoV) and NendoU domains (see Table S2 in the supplemental material), indicating that RbCoV HKU14 represented a separate species among members of
Betacoronavirus subgroup A. Based on the present results, we propose a novel species, rabbit coronavirus HKU14 (RbCoV HKU14), to describe this virus under
Betacoronavirus subgroup A CoVs.
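The demarcation argument above reduces to checking, domain by domain, whether the amino acid identity to a reference species falls below 90%. A minimal sketch of that check follows; the function name is hypothetical, and the identity values are placeholders chosen only to exercise the code, not the values reported in Table S2.

```python
SPECIES_THRESHOLD = 90.0  # percent identity threshold stated in the text


def domains_below_threshold(identities: dict, threshold: float = SPECIES_THRESHOLD):
    """Return the sorted names of replicase domains whose percent identity
    is strictly below the threshold."""
    return sorted(name for name, value in identities.items() if value < threshold)


# PLACEHOLDER numbers for illustration only; these are not data from this study.
placeholder_identities = {
    "ADRP": 88.0,
    "3CLpro": 95.0,
    "RdRp": 96.0,
    "Hel": 97.0,
    "ExoN": 94.0,
    "NendoU": 89.0,
    "O-MT": 93.0,
}
print(domains_below_threshold(placeholder_identities))
```

The comparison is strict, so a domain at exactly 90% identity is not flagged; that choice mirrors the "<90%" wording but is otherwise an assumption of this sketch.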
Recombination analyses.
Interestingly, changes in the phylogenetic position in relation to members of
Betacoronavirus 1 were observed among different regions of the RbCoV HKU14 genome (
Fig. 5). For Hel, RbCoV HKU14 is more closely related to ECoV, with 98.8 to 99% amino acid identities, than to BCoV, PHEV, and HCoV OC43. For S and N, it is more closely related to BCoV and HCoV OC43 than to ECoV and PHEV. This suggests that recombination may have occurred among these viruses during their evolution. Bootscan analysis detected potential recombination at various sites of the RbCoV HKU14 genome, most notably at around positions 7100 and 20350 (
Fig. 6A).
Upstream of position 7100, RbCoV HKU14 exhibited high bootstrap support for clustering with ECoV, but an abrupt drop in clustering was observed downstream of position 7100. Similarity plot and sequence alignment analyses showed that upstream of position 7100, RbCoV HKU14 possessed a higher level of sequence similarity to ECoV than to PHEV, BCoV, and HCoV OC43, as a result of deletions in the latter three viruses (
Fig. 6B, and see Fig. S1A in the supplemental material). However, such close similarity to ECoV was no longer observed downstream of position 7100. This suggested that RbCoV HKU14 and ECoV have probably coevolved at the region upstream of position 7100, although a recombination event could not be ascertained.
Upstream of another potential recombination site at position 20350, RbCoV HKU14 exhibited high bootstrap support for clustering with ECoV, but an abrupt drop in clustering was observed downstream of position 20350. Similarity plot and sequence alignment analyses showed that upstream of position 20350, RbCoV HKU14 possessed a higher level of sequence similarity to ECoV than to PHEV, BCoV, and HCoV OC43, whereas downstream of position 20350, RbCoV HKU14 possessed a higher level of sequence similarity to BCoV and HCoV OC43 than to ECoV and PHEV (
Fig. 6B, and see Fig. S1B and S1C in the supplemental material). A phylogenetic analysis of partial sequences upstream and downstream of position 20350 also showed a shift of the phylogenetic clustering of RbCoV HKU14, which clustered with ECoV upstream (
Fig. 6C) and with BCoV and HCoV OC43 downstream (
Fig. 6D) of position 20350. These findings indicated that recombination may have taken place at around position 20350 (corresponding to nsp15) between ECoV and BCoV/HCoV OC43 in the generation of RbCoV HKU14.
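The similarity plots referred to above rest on a sliding-window comparison between a query genome and each reference genome in a multiple alignment. The sketch below illustrates only that windowed identity calculation; it is not bootscanning, which builds bootstrap-resampled trees for each window. The function name, the window and step sizes, and the toy sequences are assumptions for illustration and are not the parameters or data used in this study.

```python
def window_similarity(query: str, reference: str, window: int = 200,
                      step: int = 20, gap: str = "-"):
    """Return (window midpoint, identity) pairs along two pre-aligned sequences.

    Midpoints are 0-based alignment columns. Gapped columns are excluded
    from each window, and windows consisting only of gaps are skipped.
    """
    if len(query) != len(reference):
        raise ValueError("aligned sequences must have the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    points = []
    for start in range(0, len(query) - window + 1, step):
        pairs = [
            (q, r)
            for q, r in zip(query[start:start + window], reference[start:start + window])
            if q != gap and r != gap
        ]
        if not pairs:
            continue
        identity = sum(1 for q, r in pairs if q == r) / len(pairs)
        points.append((start + window // 2, identity))
    return points


# Made-up toy alignment: identical in the first half, different in the second.
print(window_similarity("AAAACCCC", "AAAAGGGG", window=4, step=4))
```

Running the same function against each reference in turn and plotting the resulting curves together shows where the most similar reference changes along the genome, which is the pattern interpreted above as a possible recombination breakpoint.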