Volume 48, Issue 3 p. 617-629

RNA-modifying machines in archaea

Arina D. Omer


Department of Biochemistry and Molecular Biology, University of British Columbia, 2146 Health Sciences Mall, Vancouver, BC, V6T 1Z3, Canada.

Sonia Ziesche


Department of Biochemistry and Molecular Biology, University of British Columbia, 2146 Health Sciences Mall, Vancouver, BC, V6T 1Z3, Canada.

Wayne A. Decatur


Department of Biochemistry and Molecular Biology, Lederle Graduate Research Tower, Box 34505, University of Massachusetts, Amherst, MA 01003–4505, USA.

Maurille J. Fournier


Department of Biochemistry and Molecular Biology, Lederle Graduate Research Tower, Box 34505, University of Massachusetts, Amherst, MA 01003–4505, USA.

Patrick P. Dennis


Department of Biochemistry and Molecular Biology, University of British Columbia, 2146 Health Sciences Mall, Vancouver, BC, V6T 1Z3, Canada.

Division of Molecular and Cellular Biosciences, National Science Foundation, 4201 Wilson Blvd., Arlington, VA 22230, USA.

First published: 09 April 2003
*For correspondence. E-mail [email protected]; Tel. (+1) 703 2927145; Fax (+1) 703 2929061.

Summary

It has been known for nearly half a century that coding and non-coding RNAs (mRNA, and tRNAs and rRNAs respectively) play critical roles in the process of information transfer from DNA to protein. What is both surprising and exciting are the discoveries in the last decade that cells, particularly eukaryotic cells, contain a plethora of non-coding RNAs and that these RNAs can either possess catalytic activity or function as integral components of dynamic ribonucleoprotein (RNP) machines. These machines appear to mediate diverse, complex and essential processes such as intron excision, RNA modification and editing, protein targeting and DNA packaging. Archaea have been shown to possess RNP complexes; some of these are authentic homologues of the eukaryotic complexes that function as machines in the processing, modification and assembly of rRNA into ribosomal subunits. Deciphering how these RNA-containing machines function will require a dissection and analysis of the component parts, an understanding of how the parts fit together and an ability to reassemble the parts into complexes that can function in vitro. This article summarizes our current knowledge about small non-coding RNAs in Archaea, their roles in ribosome biogenesis and their relationships to the complexes that have been identified in eukaryotic cells.

Introduction

Ribosome biosynthesis is an extraordinarily complex process that is of fundamental importance in all living cells. In eukaryotes, the biogenesis of cytoplasmic ribosomes is localized in a specialized subnuclear compartment, the nucleolus. Often referred to as the factory for ribosome production, the nucleolus harbours most steps in the pathway leading to the production of mature ribosomal subunits; these include transcription, modification, processing and folding of precursor rRNA (pre-rRNA) and assembly of the rRNAs along with ribosomal proteins into small and large ribosomal subunits (reviewed in Bachellerie and Cavaille, 1998; Fattica and Tollervey, 2002). In most eukaryotic species the rRNA genes are clustered, with the coding units for the small subunit (18S) and two of the three large subunit (5.8S and 25S/28S) rRNAs forming an operon, which is repeated several hundred times in the genome. These ribosomal RNA operons are transcribed by RNA polymerase I to form a long precursor from which the external 5′ and 3′ transcribed spacers (ETSs) and the two internal transcribed spacers (ITS1 and ITS2) are removed through a series of endo- and exonucleolytic processing events. The 5S rRNA gene is separately transcribed by RNA polymerase III, presumably in the nucleoplasm. Available evidence suggests that before processing of the primary transcript is completed, the mature rRNA sequences are heavily modified, mainly by 2′-O-ribose methylation and by pseudouridylation, at about 200 and 100 sites in mammals and yeast respectively (Maden, 1990; Lane et al., 1995). Neither processing nor covalent modification takes place on naked RNA; instead they occur concomitantly with the addition of ribosomal proteins (r-proteins) and other trans-acting factors (non-ribosomal proteins and RNAs) that contribute to the formation of intermediate particles in the pathway leading to the production of mature small and large ribosomal subunits. The final steps of ribosomal subunit assembly occur in the cytoplasm.

In addition to rRNA precursors, the nucleolus contains scores of small, metabolically stable RNAs called small nucleolar RNAs (snoRNAs). These non-coding snoRNAs are present as ribonucleoprotein particles (snoRNPs) that act as molecular machines and function at various steps in the ribosome biosynthetic pathway (Kiss, 2001; Decatur and Fournier, 2003; and references therein). These machines have been implicated in the cleavage, nucleotide modification and folding of the precursor rRNA substrate. The snoRNAs fall into three distinct classes: the C/D box RNAs, the H/ACA RNAs and the MRP RNA. The C/D box snoRNAs guide 2′-O-ribose methylation to specific locations in rRNA. All members of this family contain two short consensus motifs designated box C (RUGAUGA) and box D (CUGA), situated near the 5′- and 3′-ends of the molecule respectively, and typically a second, degenerate version of these boxes (referred to as the C′ and D′ boxes) near the centre of the molecule (Bachellerie and Cavaille, 1998; Kiss, 2001; Bachellerie et al., 2002; Fig. 1A).

Fig. 1. Structural features of two major classes of eukaryotic snoRNAs. The C/D box and H/ACA box snoRNAs use antisense guide elements to target ribose methylation and pseudouridylation in rRNA.
A. Most C/D box snoRNAs contain one or two regions of complementarity to rRNA that are positioned 5′ to the D or D′ box; 2′-O-ribose methylation is directed to the nucleotide in rRNA that participates in a Watson-Crick base pair 5 nucleotides upstream of the D′ or D box. Most of these RNAs have a short terminal hairpin.
B. The H/ACA box snoRNAs contain one or two regions of hyphenated complementarity to rRNA that are within the bulge regions of the 5′ or 3′ helices; base pairing to rRNA positions the uridine nucleotide to be modified in a pocket between the hyphenated regions of rRNA-snoRNA complementarity.

Most C/D box RNAs contain a 4–5 nucleotide long duplex hairpin that is important for bringing the C and D motifs into close proximity. This terminal stem region is likely to act as a recognition signal for the binding of protein factors required during snoRNA biosynthesis and nucleolar localization (Cavaille et al., 1996; Caffarelli et al., 1998; Samarsky et al., 1998). For the C/D box snoRNAs that lack the terminal helical motif, the juxtaposition of the C and D elements is mediated by internal or external stems flanking the boxes (Darzacq and Kiss, 2000; Villa et al., 2000). The characteristic feature defining the box C/D methylation guide snoRNAs is the 10–21 nucleotide long guide sequence located upstream of the D or D′ box that is complementary to the rRNA and spans the site of methylation (Bachellerie et al., 1995; Bachellerie et al., 2002). Target selection relies on canonical duplex formation between the guide region of the snoRNA and the rRNA target, such that the residue to be methylated in the target RNA base-pairs with the fifth nucleotide upstream of the start of the D or D′ box. This is the N plus five rule (Cavaille et al., 1996; Kiss-Laszlo et al., 1996; 1998; Nicoloso et al., 1996). Initial studies identified fibrillarin as the signature protein cofactor that associates with all C/D box snoRNAs, but the repertoire of snoRNA-associated proteins has now expanded to include the two paralogous proteins NOP56 and NOP58, and the 15.5 kDa (human)/Snu13p (yeast) protein (Filipowicz and Pogacic, 2002).
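
The N plus five rule can be illustrated with a short worked example. The Python sketch below is illustrative only: the function name `predict_methylation_site`, the fixed 12-nucleotide guide length, the choice of the 3′-most CUGA as the D box and both toy sequences are assumptions made here, and only strict Watson-Crick pairing (no G:U pairs) is considered.

```python
from typing import Optional, Tuple

# Watson-Crick complements for RNA (assumption: G:U wobble pairs are ignored).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def predict_methylation_site(
    guide_rna: str, target_rna: str, guide_len: int = 12, d_box: str = "CUGA"
) -> Optional[Tuple[int, str]]:
    """Apply the N plus five rule to a toy guide/target pair.

    Returns the 0-based index and identity of the target nucleotide that
    pairs with the fifth guide nucleotide upstream of the D box, or None
    if no D box or no fully complementary target region is found.
    """
    d_start = guide_rna.rfind(d_box)  # 3'-most occurrence taken as the D box
    if d_start < guide_len:           # also covers the "not found" case (-1)
        return None
    guide = guide_rna[d_start - guide_len:d_start]
    paired_region = reverse_complement(guide)
    match_start = target_rna.find(paired_region)
    if match_start == -1:
        return None
    # The duplex is antiparallel: the guide nucleotide immediately upstream
    # of the D box pairs with the first nucleotide of the matched target
    # region, so the fifth upstream nucleotide pairs with offset 4.
    site = match_start + 4
    return site, target_rna[site]


# Hypothetical sequences, constructed only to exercise the function.
TOY_SRNA = "GCAUGAUGACC" + "AGCUAGCAUGGA" + "CUGA" + "GC"
TOY_TARGET = "GGAUCCAUGCUAGCUAAGG"

if __name__ == "__main__":
    print(predict_methylation_site(TOY_SRNA, TOY_TARGET))
```

Real guide regions vary in length (10–21 nucleotides) and a D′ box guide may also be present, so a practical tool would need to examine each box and tolerate imperfect duplexes.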

The second class, the box H/ACA snoRNAs, targets the conversion of uridine to pseudouridine at specific locations in rRNA. Representatives of this class fold into a typical hairpin–hinge–hairpin–tail secondary structure (Balakin et al., 1996; Ganot et al., 1997; Fig. 1B). Within this structure, the box H (ANANNA, where N can be any nucleotide) is positioned in the single-stranded hinge region, and the conserved ACA sequence is invariably located in the single-stranded tail, ending three nucleotides from the 3′-terminus of the molecule. Like the C/D box snoRNAs, the H/ACA box members base pair with the rRNA target by forming two short 4–8 nucleotide duplexes that surround the residue to be isomerized. The uridine target remains unpaired within the ‘pseudouridylation pocket’, situated 14–17 nucleotides upstream of the H or ACA box (Ganot et al., 1997; Ni et al., 1997). Like the C and D box motifs, the H and ACA boxes play critical roles in processing, accumulation and localization of snoRNAs (Balakin et al., 1996; Bortolin et al., 1999). All H/ACA snoRNAs specifically associate with the nucleolar protein Gar1p and several other proteins including the putative pseudouridine synthase Cbf5p, Nhp2p and Nop10p (Henras et al., 1998; Watkins et al., 1998).
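
The two sequence signatures described above can be checked with a few lines of Python. This is a minimal sketch, not a snoRNA predictor: the function name `has_h_aca_signature` and the toy sequences are hypothetical, and the check ignores the hairpin–hinge–hairpin–tail fold, so it cannot confirm that a box H match actually lies in a single-stranded hinge.

```python
import re

# Box H consensus ANANNA, where N is any nucleotide.
BOX_H = re.compile(r"A[ACGU]A[ACGU]{2}A")


def has_h_aca_signature(rna: str) -> bool:
    """Check for an ACA box ending three nucleotides from the 3' terminus
    and at least one box H (ANANNA) match somewhere upstream of it."""
    if len(rna) < 12:
        return False
    aca_ok = rna[-6:-3] == "ACA"             # ACA followed by exactly 3 nt
    box_h_ok = BOX_H.search(rna[:-6]) is not None
    return aca_ok and box_h_ok


if __name__ == "__main__":
    # Hypothetical sequences, far shorter than any real H/ACA snoRNA.
    print(has_h_aca_signature("GGCCAGAUCAGGCCUUACAUUU"))
    print(has_h_aca_signature("GGCCGGCCUUACAUU"))
```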

The third class of snoRNAs contains only a single representative, the MRP RNA. An RNP complex containing this RNA and nine protein components has been shown to be involved in endonucleolytic processing within the ITS1 region of precursor rRNA (Lygerou et al., 1994). The MRP RNA is related to the RNA component of the RNase P endonuclease, the complex that generates the mature 5′-end of tRNAs (Chamberlain et al., 1998). Remarkably, eight of the RNase MRP protein components are shared with RNase P. The MRP RNP complex has also been linked to mitochondrial DNA replication (Stohl and Clayton, 1992).

Growing evidence suggests that at least some of these RNP machines, once thought to be exclusively nucleolar, are found in other areas of the nucleus and are involved in processes other than the modification of rRNA. For example, several C/D box and H/ACA box RNAs have been implicated in the nucleotide modification of spliceosomal snRNAs and possibly also mRNAs and have been localized within coiled or Cajal bodies (Tycowski et al., 1998; Ganot et al., 1999; Cavaille et al., 2000; Darzacq et al., 2002; Kiss et al., 2002). The Cajal bodies are dynamic nuclear organelles that concentrate many components involved in the transcription and processing of nuclear RNAs and additionally are thought to play an important role in the biogenesis, maturation and export of snRNPs to nuclear speckles and of snoRNPs to the nucleolus (Gall, 2000; Spector, 2001). The small sno-like RNAs residing permanently in Cajal bodies appear to be distinct and have been named scaRNAs (Kiss et al., 2002).

Within this perspective, understanding the function, versatility and diversity of sno- and sno-like RNPs will require a detailed characterization of their cellular trafficking patterns, their catalytic activities, and the network of interactions that occur between the RNAs, their protein constituents and other cellular components. In complex eukaryotic cells, these guide-containing RNP complexes are transient and unstable and their molecular architecture has been difficult to establish or reconstitute. The best progress to date in the functional characterization of these complexes has come from the purification of yeast C/D and H/ACA RNP complexes that retain methylation and pseudouridylation activity (Galardi et al., 2002; Wang et al., 2002). Are there simpler systems where the biochemical, structural and functional complexities of these RNP machines can be more easily addressed?

Archaea contain small RNA-modifying machines

Archaea have been defined as a phylogenetically distinct group of prokaryotic organisms that comprise two main branches: the Euryarchaeota, which includes the methanogens and the extreme halophiles, and the Crenarchaeota, which includes the sulphur-dependent thermophilic acidophiles (Fox et al., 1980; Olsen and Woese, 1997). A trait common to all these archaeal organisms is the presence of mosaic bacterial and eukaryal features. Archaea are bacterial-like for features such as cell structure, genome organization, and structure and function of enzymes involved in basic metabolism, but they are eukaryal-like for features involved in information storage and transfer (i.e. the machineries used for DNA replication, RNA transcription and protein translation; Dennis, 1997). In terms of the extent of 2′-O-ribose methyl modification, the level of modification in the rRNAs of at least one archaeal species appears similar to that present in eukaryotes. In Sulfolobus solfataricus there are 67 sites of ribose methyl modification in the small and large subunit rRNAs, a number intermediate between those found in yeast (55) and humans (105) (Maden, 1990; Noon et al., 1998). Moreover, sequenced archaeal genomes encode three proteins that display a high degree of sequence similarity to the eukaryotic snoRNA-associated proteins: archaeal fibrillarin (aFIB), archaeal Nop56 (aNOP56; Lafontaine and Tollervey, 1998) and archaeal L7a (aL7a, which is homologous to the 15.5 kDa/Snu13p protein; Kuhn et al., 2002; Omer et al., 2002).

The first demonstration that Archaea possess guide RNAs that direct methylation to rRNA came from biochemical studies with Sulfolobus (Gaspin et al., 2000; Omer et al., 2000). Antibodies against the aFIB and aNOP56 proteins from Sulfolobus were used to immunoprecipitate complexes containing these proteins from cell extracts. Nearly 30 different cDNA clones with features characteristic of eukaryotic C/D box RNAs were recovered from the RNA present in these precipitates. The RNAs contained antisense sequences complementary to rRNA and methyl modification of rRNA was demonstrated using a primer extension pause reaction. Based on the sequence features of the cloned Sulfolobus sRNAs, a probabilistic model was constructed and used to search other archaeal genomes for the presence of additional genes encoding C/D box RNAs. Up to 50 or more genes encoding snoRNA-like RNAs were identified in the various species of Archaea.
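
As a much simpler stand-in for the probabilistic model mentioned above, a genome transcript can be screened for the two consensus boxes alone. The sketch below is not the published search procedure: the function name `find_cd_candidates`, the 40–70 nucleotide window and the toy sequence are assumptions chosen for illustration, and a plain motif scan of this kind would return many false positives on a real genome.

```python
import re
from typing import List, Tuple

# Box C consensus RUGAUGA (R = A or G); the lookahead reports overlapping hits.
BOX_C = re.compile(r"(?=[AG]UGAUGA)")
BOX_D = "CUGA"


def find_cd_candidates(
    rna: str, min_span: int = 40, max_span: int = 70
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs for regions that begin with a box C
    and contain a box D lying entirely within the window that runs from
    min_span to max_span nucleotides downstream of the box C start."""
    hits = []
    for match in BOX_C.finditer(rna):
        c_start = match.start()
        d_start = rna.find(BOX_D, c_start + min_span, c_start + max_span)
        if d_start != -1:
            hits.append((c_start, d_start + len(BOX_D)))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence: a box C, a 40-nt spacer and a box D.
    toy = "CC" + "AUGAUGA" + "C" * 40 + "CUGA" + "GG"
    print(find_cd_candidates(toy))
```

A fuller screen would also score the internal C′ and D′ boxes and the complementarity of the guide regions to rRNA, which is what distinguishes genuine guides from chance motif pairs.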

Curiously, the number of easily identifiable sRNAs in a particular genome is related to the optimum growth temperature – more sRNA gene candidates have been identified in organisms with a high growth temperature than organisms with a low growth temperature (Dennis et al., 2001). What is the function of 2′-O-ribose methyl modification in rRNA, how did guides evolve to target modifications and why do the number of guides and methyl modifications relate to the growth temperature? Although the answers to these questions are currently unknown, it is believed that ribose methylation contributes to the stabilization of secondary structure in RNA (Davis, 1998). In addition, it has been suggested that the guide RNAs may play a chaperone function (Bachellerie et al., 1995; Maxwell and Fournier, 1995; Michot et al., 1999; Dennis et al., 2001); their base pairing with the nascent rRNA during the assembly process may direct the efficient localized folding of the rRNA and thereby prevent the formation of non-productive structures that block or delay the ribosome assembly process. Higher growth temperatures may require more stabilization of RNA structure and more chaperones to direct the folding process (Dennis et al., 2001).

What are the component parts and how do they fit together?

To identify additional components of archaeal sRNP particles, Sulfolobus solfataricus complexes were purified using sucrose gradient centrifugation, ion exchange chromatography and immunoaffinity chromatography. Purification was monitored using Western blot analysis with anti-aFIB. In addition to aFIB and aNOP56, a third low molecular weight protein was observed on SDS-PAGE. N-terminal amino acid sequencing identified this protein as a previously annotated aL7a ribosomal protein (A. D. Omer, unpubl. data; Ban et al., 2000; Omer et al., 2002). Using immunoaffinity purification and Western blot analysis it was possible to demonstrate that all three proteins are present in the complex (A. D. Omer, unpubl. data).

The aL7a protein exhibits sequence homology to several eukaryotic ribosomal proteins such as L7a, S12 (human) and L30 (yeast) as well as non-ribosomal proteins including the H/ACA box snoRNP Nhp2p protein and the spliceosomal 15.5 kDa/Snu13p protein. The eukaryotic 15.5 kDa/Snu13p protein (also called NHPX; Leung and Lamond, 2002) has two functions, as a core factor that binds the conserved 5′ asymmetric loop structure of the spliceosomal U4 snRNA, and as a core factor that binds to the box C/D motif in all C/D box snoRNAs. Determination of the crystal structure of the 15.5 kDa protein bound to the 5′-stem–loop structure of U4 snRNA revealed the RNA motif that serves as the site for protein recognition (Vidovic et al., 2000). This motif consists of an asymmetric (2 + 5 nucleotides) loop stabilized by tandem, sheared G:A base pairs and contains an unpaired, protruding uridine critical for 15.5 kDa protein recognition (Fig. 2). The loop is flanked by two tightly packed helical structures, designated stem I and II. The RNA motif, defined as the K-turn, is present in a multitude of RNAs in all three kingdoms of life; these include the eukaryotic and archaeal C/D box sno- and sno-like RNAs where the C and D boxes become part of an asymmetric loop similar to the RNA binding motif present in U4 snRNA (Watkins et al., 2000; Klein et al., 2001).

Fig. 2. Predicted secondary structure of the RNA motif binding the 15.5 kDa protein.
A. The secondary structure of the K-turn in the conserved 5′-stem–loop structure of the spliceosomal U4 snRNA and in the C/D box consensus sequences present in the C/D box snoRNAs is illustrated.
B. A structure of the human 15.5 kDa protein bound to the K-turn motif in the 5′-stem–loop fragment of U4 snRNA is illustrated (taken from Vidovic et al., 2000). Residue U31 is flipped out of the loop and inserts into a cavity of the protein while two adjacent A:G base pairs (G32:A44 and A33:G43) extend stem II into the loop.

The demonstration that archaeal aL7a is the functional homologue of the 15.5 kDa/Snu13p protein came from in vitro studies using recombinant protein and either a typical archaeal or eukaryotic C/D box RNA transcript (Kuhn et al., 2002; Omer et al., 2002). Neither aFIB nor aNOP56 exhibits sRNA binding activity, either alone or in combination. However, the presence of aL7a bound to the sRNA nucleates the ordered addition of first aNOP56 and then aFIB to the complex (Omer et al., 2002; Fig. 3A).

Fig. 3. Reconstitution and activity of an archaeal C/D box RNP complex.
A. Recombinant aL7a, aNOP56 and aFIB proteins were added individually or in combination to an in vitro transcript of S. acidocaldarius C/D box sR1 sRNA. The mixtures were separated on a 6% non-denaturing PAGE to resolve complexes. Complex I, sR1 sRNA-aL7a; complex II, sR1 sRNA-aL7a-aNOP56; complex III, sR1 sRNA-aL7a-aNOP56-aFIB. A secondary structural model of the sRNA is depicted on the right. The base-pair interaction between U52 in the 16S rRNA target and position +5 in the guide sR1 is boxed. The aL7a protein is predicted to bind to the loops generated by the C/D or C′/D′ motifs (indicated by *). The base predicted to rotate out of the loop and insert into the pocket of the protein is the first U residue in the C or C′ box sequence and is highlighted in black.
B. RNP complexes were assembled using in vitro transcribed sR1 sRNA (120 pmoles), target RNA (120 pmoles), aFIB (or the P129V and A85V mutants of aFIB), aNOP56 and aL7a (4 pmoles of each protein); [3H]-methyl-S-adenosyl-methionine (60 pmoles) was added. The reactions were transferred to 70°C. Reactions were removed at the indicated times and precipitated with 5% trichloroacetic acid. The precipitates were collected on nitrocellulose filters and dried, and radioactivity was determined by scintillation counting (see Omer et al., 2002). The predicted secondary structures of the guide and target RNAs are indicated on the right and the kinetics of methyl incorporation are shown on the left. The A85V aFIB mutant gave the same activity as the control reaction lacking the aFIB protein (adapted from Omer et al., 2002).

A similar, stepwise mechanism of complex formation was recently characterized for the assembly of the human U4/U6 snRNA complex (Nottrott et al., 2002). The homologue of the aL7a protein, the small 15.5 kDa protein described above, binds directly to the 5′-stem–loop structure of U4 snRNA. Formation of this binary complex is a prerequisite for the association of the newly identified 61 kDa protein. Interestingly, the central region of the 61 kDa protein, spanning amino acid residues 92–328, displays significant sequence similarity to the C/D box snoRNP-specific proteins NOP56 and NOP58 (18% identity, 57% similarity; Makarova et al., 2002). Cross-linking studies have identified His 270 within the ‘NOP homology region’ (positions 260 and 273) as the site of contact with the U4 snRNA. The corresponding contact site on the U4 snRNA maps to the same 5′-stem–loop structure that is key for binding of the 15.5 kDa nucleation protein (Nottrott et al., 2002). An alignment of the corresponding regions of the Sso aNOP56 and human 61 kDa proteins reveals 24% identity and 53% similarity between the two sequences (Fig. 4). In all human and archaeal NOP56/58 sequences, the amino acid residue corresponding to His 270, which is key for contacting the U4 RNA in the 61 kDa protein, is a conserved lysine residue. At the present time there is no direct evidence that this Lys interacts with the K-turn motif within C/D box RNAs. Nonetheless, these observations suggest that eukaryotic spliceosomal and modification machineries are related and share homologous components.
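
Percent identity figures such as those quoted above are computed from a pairwise alignment. The Python sketch below shows one common way to do this, under assumptions that are not taken from the original analysis: the function name `percent_identity` is hypothetical, gapped columns are excluded from the denominator, and the two aligned strings are invented toy sequences rather than NOP56 or 61 kDa protein sequences. Percent similarity would additionally require a definition of which residue pairs count as similar (for example, a substitution matrix), which is not attempted here.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over the ungapped columns of a
    pairwise alignment (both strings must have the same aligned length)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # assumption: columns containing a gap are not counted
        compared += 1
        if res_a == res_b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


if __name__ == "__main__":
    # Toy alignment: five ungapped columns, three of them identical.
    print(percent_identity("MKV-LA", "MRVQLG"))
```

Different programs normalize by alignment length, shorter sequence length or ungapped columns, so reported values are only comparable when the convention is the same.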

Fig. 4. The spliceosomal 61 kDa protein shares significant sequence homology with the archaeal C/D box core protein aNOP56. A ClustalW sequence alignment of NOP56 homologues with the human 61 kDa protein (aa 93–328, accession number AY040822) is illustrated. The NOP56 sequences used in the alignment were aa 140–378 of the S. solfataricus aNOP56 (Sso, accession number AAK41215) and aa 168–408 of the human NOP56 (accession number Y12065). The position in the human 61 kDa protein of the critical His 270 required for contact with U4 RNA in the NOP homology region is marked with an asterisk; this residue is a highly conserved Lys in the NOP56/58 and aNOP56 proteins.

How do these machines work?

A first demonstration of the conserved structure and function shared between the eukaryotic and archaeal box C/D sRNPs was provided by microinjection of archaeal C/D box guide RNAs into Xenopus oocytes (Speckmann et al., 2002). The archaeal sRNAs localize to the nucleolus and direct site-specific methyl modification of rRNA.

Using recombinant versions of the archaeal C/D box-associated proteins and in vitro transcribed C/D sRNAs, we were able not only to assemble an sRNP core complex in vitro, but also to demonstrate that it possessed site-specific 2′-O-ribose methylation activity when provided with a short fragment of rRNA substrate and the methyl donor, S-adenosyl-methionine (SAM) (Omer et al., 2002; Fig. 3B). Approximately two pmoles of product were formed for each pmole of protein present in the reaction mixture, suggesting that the catalytic system exhibits multiple turnovers. Nucleotide mapping demonstrated that methylation was targeted specifically to the predicted position in the substrate RNA. Mutations that disrupted the Watson-Crick base pair between the guide and target sequences at the site of methylation rendered the complex methylase defective, whereas compensatory mutations that restored the base pair between the guide and target restored activity. Finally, the methylase activity of the complex was shown to be dependent on the integrity of the S-adenosyl-methionine binding domain within the fibrillarin protein. Two amino acid substitutions within this binding domain (P129V and A85V) were constructed and shown to be active in RNP assembly, but partially or completely defective in the methyl-transferase function.

Positions of methylation within the three dimensional structure of the ribosome

Previous studies analysing the distribution of methylation sites within the secondary structures of the small and large rRNAs of Archaea and eukaryotes revealed two important features: first, methyl modifications are confined to the functionally important and highly conserved regions of the rRNA and second, the exact position of most methylations within these conserved regions is phylogenetically variable (Maden, 1990; Dennis et al., 2001). Analysis of the predicted sites of methyl modification in three closely related species of the archaeal genus Pyrococcus (where the genomes have been sequenced and the complete set of sRNAs is known) indicated that the guide sequences within sRNAs diverge much more rapidly than the corresponding target sequences within rRNA (Dennis et al., 2001). Moreover, deletion of single or multiple genes encoding methylation guide C/D snoRNAs in S. cerevisiae has little or no detectable phenotype, indicating that the absence of single or even a few methylation sites is not critical to ribosome function (Samarsky et al., 1995; Lowe and Eddy, 1999; Qu et al., 1999). In contrast, global methylation appears to be essential. In vivo analysis of temperature-sensitive variants of yeast fibrillarin revealed that mutations in the SAM-binding domain disrupt methylation of nascent rRNA and assembly of new rRNA into ribosomal subunits (Tollervey et al., 1993; Wang et al., 2000).

With the availability of high-resolution structures of the small and large ribosomal subunits, it is now possible to deduce the location of the modified nucleotides in the ribosome, in 3D space (Decatur and Fournier, 2002; Hansen et al., 2002; Ofengand, 2002). To this end, we have developed the first 3D modification map for an archaeal ribosome, in particular, for the ribose methylations in the Pyrococcus horikoshii ribosome (Fig. 5A–E).

Fig. 5. Three-dimensional distribution of the 2′-O-ribose methylated positions predicted in the P. horikoshii rRNA. Sites of methylation were deduced from the sequences of candidate sRNAs identified in the P. horikoshii genome (http://rna.wustl.edu/snoRNAdb/). Equivalent positions are highlighted in the crystal structures of the small and large ribosomal subunits, derived for Thermus thermophilus (PDB entry 1fjf) and Haloarcula marismortui (PDB entries 1ffk and 1ffz) respectively.
A and B. The P. horikoshii SSU rRNA contains 40 predicted target sites. Helix 44 of the SSU rRNA is indicated in cyan. Major morphological features (head, neck, body) are labelled in A. Dashed lines bound the area referred to as the midsection in the text.
C–E. The large subunit map shows 54 of the 62 predicted sites of methylation; the remaining eight occur in regions not visible because of disorder in the current crystal structure. The peptidyl transferase transition-state analogue is shown bound to the reaction centre of the large subunit (magenta) (C, D). Functional regions are indicated for each subunit in A and C.
In A–E, the ribose methylated nucleotides are distinguished by showing full atomic volume (van der Waals radii, green); backbone representation is used to illustrate the rRNA (grey) and protein side-chains (blue in SSU; maroon in LSU) and skeleton representation for the bases of non-targeted nucleotides (grey). In A and C, the subunit interface is towards the front. B and D are side perspectives and E is a view down the polypeptide exit tunnel of the LSU. In D, the bases of the non-targeted nucleotides and all proteins are hidden and in E the transition-state analogue is hidden to facilitate viewing the tunnel.

The set of P. horikoshii guide sRNAs consists of 60 species with the potential to target methylation to 102 rRNA sites. These sites were mapped on the crystal structures of the 30S subunit (40 sites) and 50S subunit (54 of 62 sites). Eight sites in the large subunit cannot be visualized because of disorder in four internal rRNA regions in the crystal structure. Because methylations occur in highly conserved regions of the rRNA where sequence alignments are unambiguous, it is possible to locate with high accuracy individual sites within the sequence. However, it is less certain if all of the predicted sites are in fact methylated. For example, many of the sRNAs are predicted to use both the D box and D′ box guides to base pair to two separate, but closely positioned targets within the rRNA. Whether both or only one of these sites is methylated has not been investigated. It may be that only one of the targets is actually methylated and the second is used only to stabilize the complex.

Similar to the situation observed in S. cerevisiae and E. coli for all types of nucleotide modification, the P. horikoshii 3D maps show that ribose methylation sites are largely concentrated in subunit regions that are free of protein. Whereas many sites correlate with regions known, or reasonably expected to be important for ribosome function, several others do not. In the small subunit structure (Fig. 5A and B) most of the methylated sites are clustered toward the interface where the subunits join and the decoding activities occur, but are absent from the central part of helix 44, which forms a significant portion of the interface with the large subunit. A similar pattern is detected in the small subunit of S. cerevisiae, but not for the E. coli small subunit where only one such modification occurs (Decatur and Fournier, 2002). Half of the eight methylated sites in the head-portion are clustered immediately above the region where the tRNA anticodon arm contacts the small subunit, in the A, P and E sites. The midsection is densely populated with 29 modifications (Fig. 5A). This part of the subunit includes the platform that supports the tRNAs and mRNA during decoding. Finally, seven methylations are predicted to occur at the base of the body (lower portion) and five of these are part of a bridge that contacts the base of the large subunit, below the sarcin-ricin loop region (Yusupov et al., 2001).

Most of the methylation sites in the large subunit occur in regions that surround the peptidyl transferase centre and the areas where the tRNA acceptor arm makes contact with the large subunit (Fig. 5C and D). In contrast, the region where translation factors approach the A-site in the large subunit is relatively devoid of ribose methyl modified sites, as was previously observed for yeast (and E. coli, where three of four such ribosomal methylations are in the large subunit; Decatur and Fournier, 2002). The area immediately adjacent to the tRNA 5′- and 3′-ends contains fewer methylations than in yeast, but the total number of methylations predicted for the active site is comparable to that known to occur in the yeast and human large subunit rRNAs. A very intriguing aspect of the distribution of ribose methylated sites in the P. horikoshii large subunit is that several methylated positions actually line the upper end of the lumen of the polypeptide exit tunnel (Fig. 5E). In yeast, the RNA forming the walls of the upper end of the polypeptide exit tunnel is heavily modified, but the methylations that extend beyond the predicted site of peptide bond formation are not in position to contact the nascent peptide chain. Perhaps these modifications modulate interaction between the ribosome and the nascent polypeptide (Sachs and Geballe, 2002).

The rRNA of P. horikoshii is predicted to contain almost double the number of methyl modifications of S. cerevisiae (i.e. 102 sites versus 55) and these modifications have a broader global distribution. As ribose methylation has an overall stabilizing role, the higher content and broader distribution of these modifications in extreme thermophiles such as P. horikoshii may reflect a more stringent requirement for stabilizing factors during folding of the rRNA and interaction with proteins, and during ribosome function.

Archaeal tRNATrp: another connection between modification and processing

A global analysis of the guide sequences in all available archaeal C/D box sRNAs predicts that many of these sRNAs target methylation to 23 different sites of the pretRNAs (Dennis et al., 2001). In some instances a single sRNA directs methylation to only a single species of tRNA, whereas in other instances a single sRNA can direct methylation to up to 19 different tRNAs, based on conservation of the target complementarity within the tRNA sequences (Dennis et al., 2001). The most frequently targeted position is C34 or U34, the wobble position in the anticodon of the tRNA. This position is known to be methylated in many tRNAs and is believed to contribute to specificity and stability in the codon–anticodon interaction (Satoh et al., 2000). In one particularly interesting example, the C/D box sRNA that guides modification of C34 and U39 in tRNATrp is intronically encoded.

The tRNATrp gene in many euryarchaeal species contains an intron that has properties of a C/D box sRNA (Armbruster and Daniels, 1997; Omer et al., 2000; Dennis et al., 2001). Daniels and co-workers have shown that the Haloferax volcanii intron is essential for tRNA maturation, and suggested a model that links modification and processing (Armbruster and Daniels, 1997). In the model, guide sequences within the intron are required for methylation of positions C34 and U39 within the exon regions of the precursor tRNA; at least two distinct structural rearrangements are required to generate the appropriate guide-target duplex for each methylation. Following methylation, a third structural rearrangement is required to generate the bulge–helix–bulge motif recognized by the intron endonuclease. After intron excision, the exons of the tRNA are ligated together by an uncharacterized ligase activity.

The general features of this model have now been elegantly confirmed (d’Orval et al., 2001). An S100 cell-free extract from H. volcanii, when provided with a pretRNATrp transcript and S-adenosyl-methionine, can carry out both the methylation reactions at C34 and U39, as well as other non-RNA guided modifications, and the excision reactions required to remove the intron. Various internal deletions of the intron were constructed and tested in order to identify the sequences necessary for the modification reactions and their relationships to intron excision. A deletion mutation that removed the D box guide necessary for methylation of C34 was defective in methylation at both C34 and U39, and the intron was not removed from the pretRNA. In contrast, a deletion of the D′ and D″ (a second copy of the D′) boxes and the associated guides was defective only in methylation of position U39. These and previous results suggest that methylation at position C34 is critical and may be required for subsequent methylation at U39 and for excision of the intron by the bulge–helix–bulge endonuclease. This is in contrast to the situation present in eukaryotes, where methylation at position C34 occurs strictly after intron removal in an RNA-independent, protein-catalysed process (Pintard et al., 2002).
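The dependency suggested by these deletion experiments can be written as a small logical sketch. The Python below is an illustration only, not a model taken from the cited work: the function and field names are hypothetical, and it assumes the simplest reading of the results, namely that C34 methylation needs the D box guide, that U39 methylation needs both the D′-associated guides and prior C34 methylation, and that intron excision needs C34 methylation.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """Predicted fate of the pretRNATrp transcript (hypothetical names)."""
    c34_methylated: bool
    u39_methylated: bool
    intron_excised: bool


def predict(has_d_guide: bool, has_d_prime_guides: bool) -> Outcome:
    """Apply the assumed dependency: C34 first, then U39 and excision."""
    # Assumption: the D box guide alone determines C34 methylation.
    c34 = has_d_guide
    # Assumption: U39 methylation needs its own guides and prior C34 methylation.
    u39 = c34 and has_d_prime_guides
    # Assumption: excision by the bulge-helix-bulge endonuclease needs C34 methylation.
    excised = c34
    return Outcome(c34, u39, excised)


if __name__ == "__main__":
    print("intact intron:       ", predict(True, True))
    print("D guide deleted:     ", predict(False, True))
    print("D' guides deleted:   ", predict(True, False))
```

Under these assumptions the sketch reproduces the two reported phenotypes: loss of the D box guide blocks both methylations and excision, whereas loss of the D′-associated guides affects only U39.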

Other links between splicing, processing and modification

Several archaeal tRNA genes contain short introns, usually within the anticodon loop. The intron-exon junctions fold into a highly structured motif that consists of two three-base loops on opposite strands of a helix and separated by four base pairs (termed the bulge–helix–bulge motif; Diener and Moore, 1998). Maturation requires intron excision by the bulge–helix–bulge endonuclease followed by the ligation of the tRNA halves and the circularization of the spliced intron by an unidentified ligase (Kleman-Leyer et al., 1997). Most archaeal prerRNA transcripts contain the same bulge–helix–bulge motif within the 16S and 23S processing stems. These are also substrates for the intron excision endonuclease; cleavage within the motifs releases 16S and 23S precursors from the rRNA primary transcript (Dennis, 1997). Are the products of prerRNA processing also substrates for the elusive ligase? Early evidence on the commonality between cleavage and exon-splicing in rRNA and tRNA came from studies involving several hyperthermophilic archaea where the 23S rRNA gene contains an intron which is excised and subsequently ligated (Kjems and Garrett, 1988; Kjems and Garrett, 1991; Lykke-Andersen and Garrett, 1994).
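The geometry of the bulge–helix–bulge motif described above (two three-base bulges on opposite strands, separated by a four base pair helix) can be made concrete with a short check. The Python below is a minimal sketch under stated assumptions: the function names are hypothetical, the two strands are supplied already trimmed to the motif plus a fixed number of flanking base pairs, G-U wobble pairs are accepted, and neither the bulge sequences nor the cleavage sites are modelled.

```python
# Watson-Crick pairs plus G-U wobble (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pairs(x: str, y: str) -> bool:
    """True when x and y, both written 5'->3', form an antiparallel duplex."""
    return len(x) == len(y) and all((a, b) in PAIRS for a, b in zip(x, reversed(y)))


def is_bhb(strand1: str, strand2: str, flank: int) -> bool:
    """Check the 3-nt bulge / 4-bp helix / 3-nt bulge layout on two strands.

    Layout assumed (both strands written 5'->3'):
      strand1 = flank A  + 3-nt bulge + 4-nt central + flank B
      strand2 = flank B' + 3-nt bulge + 4-nt central' + flank A'
    so the two bulges sit on opposite strands, on either side of the
    central four base pair helix.
    """
    expected = flank + 3 + 4 + flank
    if len(strand1) != expected or len(strand2) != expected:
        return False
    a, central1, b = strand1[:flank], strand1[flank + 3:flank + 7], strand1[flank + 7:]
    b2, central2, a2 = strand2[:flank], strand2[flank + 3:flank + 7], strand2[flank + 7:]
    return pairs(a, a2) and pairs(central1, central2) and pairs(b, b2)


if __name__ == "__main__":
    # Invented toy sequences, not taken from any archaeal transcript.
    print(is_bhb("GCAAAGGCCAU", "AUAAAGGCCGC", flank=2))
    print(is_bhb("GCAAAGGCCAU", "AUAAAGGAAGC", flank=2))
```

A real search for the motif in a precursor transcript would also need to locate the two strands and allow for relaxed variants; the sketch only shows what the canonical layout implies.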

A recent screen in Archaeoglobus fulgidus and Sulfolobus solfataricus for novel small non-mRNA species (snmRNAs) led to the identification of several new species of stable RNAs that are derived from the transcribed spacers of prerRNA (Tang et al., 2002a). In A. fulgidus (Afu) the rRNA primary transcript contains a tRNAAla gene located between the 16S and the 23S rRNA processing stems, whereas S. solfataricus lacks this inserted tRNA gene. Upon cleavage by the BHB endonuclease, fragments of rrn spacer sequences derived from upstream and downstream of the cleavage sites are ligated together in a reaction similar to tRNA exon ligation. Interestingly, these spacer ligation products contain recognizable C and D motifs, raising the possibility that they are novel variants of C/D box sRNAs. They differ from canonical C/D box sRNAs in that they exhibit longer spacing between boxes and a different overall secondary structure (Fig. 6). In this structure, supported by chemical probing, boxes C′ and D, rather than C and D, are brought into close proximity and are predicted to form a K-turn motif. Both the A. fulgidus and the S. solfataricus snmRNAs are able to bind aL7a protein, but the affinity of the protein for these RNAs is several-fold lower than for a canonical P. abyssi C/D box sRNA. As these snmRNAs lack recognizable guide sequences, it seems likely that they function in some capacity other than targeting rRNA ribose methylation. What might this function be?

Fig. 6. Proposed secondary structure of the A. fulgidus 16S-D ligated RNA. During prerRNA processing the pre16S is removed by staggered endonuclease cleavages within the bulge–helix–bulge motif and the 5′ ETS sequence is ligated to the 3′ ITS sequence. Characteristic box features (C, D′, C′ and D) of this ligation product are highlighted. The location of the excised 16S rRNA precursor (i.e. the site of ligation) is indicated by a dotted line.

In all organisms the small subunit rRNA contains a universally conserved pseudoknot structure that tethers the loop of the 5′-terminal helix and the internal connector region between the central and major 3′-terminal domains. It has been suggested that the formation of this pseudoknot structure in eukaryotic small subunit ribosomes is mediated by the U3 snoRNA (Hughes, 1996). In this model the 5′ end of U3 base-pairs with the rRNA and facilitates correct positioning of the elements involved in the pseudoknot formation. Prokaryotes lack a U3 homologue but, at least in some examples, the function of this snoRNA might be substituted in cis by a highly conserved U3-like sequence in the 5′ ETS region of the prerRNA transcript (Dennis et al., 1997). Although a portion of the snmRNAs from A. fulgidus and S. solfataricus overlaps the region of the U3-like sequence found in other rrn operon spacers, they do not possess the clear signature sequences common to U3 snoRNAs. Nonetheless, it is possible that these snmRNAs play a role in prerRNA processing or assembly of ribosomal particles (Tang et al., 2002a).

Another aspect of the archaeal sRNA variants that remains unclear is the nature of the nucleotide protruding from the kink turn motif into a pocket of the aL7a protein. It was established that steric hindrance and specific contacts with key amino acid residues restrict this nucleotide to U (Klein et al., 2001). In the A. fulgidus 16S-D snmRNA this nucleotide is a G, which would be difficult to accommodate in the aL7a protein binding pocket. This might explain the lower affinity of the aL7a protein for the 16S-D snmRNA. Another possibility is that there might be a paralogue of aL7a within A. fulgidus with specificity for a different base.

H/ACA based pseudouridylation functions in Archaea

In contrast to eukaryotes, archaeal rRNA contains only a few pseudouridine residues; this is comparable to the number found in bacterial rRNAs where modification occurs without the use of guide RNAs. However, archaeal genomes encode proteins with apparent homology to the eukaryotic H/ACA associated proteins. Using an A. fulgidus cDNA library prepared from a pool of cellular RNAs ranging between 50 and 500 nucleotides in length, four candidates were identified that have the box H and ACA features characteristic of eukaryotic pseudouridylation guide RNAs (Tang et al., 2002b). Compared to eukaryotic H/ACA snoRNAs, the archaeal sRNAs exhibit a greater variability in the overall length and secondary structure, and contain one or three hairpin structures, which each harbour a pseudouridylation pocket (Fig. 7). Within these sRNAs the characteristic spacing of 14–16 nucleotides between the ACA or H box and the target site for uridine isomerization is maintained. Moreover, from a total of six predicted sites of pseudouridylation in A. fulgidus rRNA, five sites have been confirmed using primer extension analysis (Tang et al., 2002b; Rozhdestvensky et al., 2003). Similar to the C/D box sRNAs, the H/ACA sRNAs are likely ubiquitous in Archaea, as demonstrated by their occurrence in three Pyrococcus species and in Methanococcus jannaschii. The discovery of archaeal box H/ACA sRNAs expands the repertoire of common features shared between the archaeal and the eukaryotic rRNA modification apparatus.
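The 14–16 nucleotide spacing rule lends itself to a simple positional check when screening candidate sRNAs. The Python below is a rough sketch, not the procedure used in the cited studies: the function name is hypothetical, the box is matched as a literal string (ACA by default), and the counting convention (the index difference between the first box nucleotide and the pocket position) is an assumption made for illustration.

```python
import re


def pocket_windows(srna: str, box: str = "ACA", min_gap: int = 14, max_gap: int = 16):
    """List (box_start, window_start, window_end) for each box occurrence.

    The window covers the 0-based positions lying min_gap..max_gap
    nucleotides upstream of the first nucleotide of the box, which is
    where this sketch expects the pseudouridylation pocket to place the
    target site. Boxes too close to the 5' end are skipped.
    """
    windows = []
    # A lookahead pattern reports overlapping occurrences of the box as well.
    for match in re.finditer(f"(?={re.escape(box)})", srna):
        start = match.start()
        lo, hi = start - max_gap, start - min_gap
        if lo >= 0:
            windows.append((start, lo, hi))
    return windows


if __name__ == "__main__":
    # Invented toy sequence: 20 filler nucleotides, one ACA box, a short tail.
    toy = "G" * 20 + "ACA" + "UUU"
    print(pocket_windows(toy))
```

In practice the window would be intersected with the predicted hairpin structure, because only positions inside a pseudouridylation pocket are meaningful candidates.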

Fig. 7. Proposed secondary structure of archaeal H/ACA box sno-like RNAs. The secondary structures of the A. fulgidus Afu-4 and Afu-46 sRNAs are depicted in the consensus style of eukaryotic H/ACA guide snoRNAs. The H and ACA/AGA motifs are boxed and the guide region complementary to rRNA is highlighted with a solid line. The Afu-4 sRNA guides pseudouridylation at position 1167 in 16S rRNA and at positions 2601 and 1364 in 23S rRNA, and the Afu-46 sRNA guides pseudouridylation at position 2639 in 23S rRNA. Predicted K-turn motifs are boxed (adapted from Rozhdestvensky et al., 2003).

Interestingly, in Archaea the terminal stem–loop structure present at the top of the H/ACA sRNA hairpin can be folded in a typical K-turn structure; preliminary data indicate that, similar to the C/D box sRNAs, this K-turn motif is involved in the binding of the aL7a protein to the H/ACA sRNAs (Rozhdestvensky et al., 2003; S. Ziesche, unpubl. data).

Perspectives

The last few years have seen a rapid expansion in the identification and functional description of archaeal small RNAs. Remarkably, almost all of these are related to the plethora of small RNAs that form the core of the complex nuclear RNP machinery responsible for eukaryotic ribosome biosynthesis. The sharing of these features between eukaryotes and Archaea indicates that these RNP-mediated processes are of ancient origin and predate the divergence of the two lineages. Although a list of the component parts of these machines is beginning to emerge, little is presently known about how the parts fit together and how the machines function. Archaeal systems offer some promise for addressing these structural and biochemical issues: these cells are relatively simple and a useful source for purification of complexes, and the complexes, when derived from thermophiles or halophiles, are more likely to be sufficiently stable to retain function in vitro and to be amenable to disassembly and reassembly. If life evolved through an intermediate that used RNAs as the repository of genetic information and ribozymes to mediate essential life processes, then it seems likely that our current collection of sRNAs and our understanding of sRNA function may be only the tip of a much larger iceberg.

Acknowledgements

We thank Alexander Huttenhofer for sharing results prior to publication and for helpful suggestions. This work was supported by a grant from the Canadian Institute for Health Research to P.P.D. and the National Institutes of Health (GM19351) to M.J.F.
