Article

15 December 1998

Structure of the foot‐and‐mouth disease virus leader protease: a papain‐like fold adapted for self‐processing and eIF4G recognition

Alba Guarné, José Tormo, Regina Kirchweger, Doris Pfistermueller, Ignasi Fita, and Tim Skern

The EMBO Journal (1998) 17: 7469–7479

The leader protease of foot‐and‐mouth disease virus, as well as cleaving itself from the nascent viral polyprotein, disables host cell protein synthesis by specific proteolysis of a cellular protein: the eukaryotic initiation factor 4G (eIF4G). The crystal structure of the leader protease presented here comprises a globular catalytic domain reminiscent of that of cysteine proteases of the papain superfamily, and a flexible C‐terminal extension found intruding into the substrate‐binding site of an adjacent molecule. Nevertheless, the relative disposition of this extension and the globular domain to each other supports intramolecular self‐processing. The different sequences of the two substrates cleaved during viral replication, the viral polyprotein (at LysLeuLys↓GlyAlaGly) and eIF4G (at AsnLeuGly↓ArgThrThr), appear to be recognized by distinct features in a narrow, negatively charged groove traversing the active centre. The structure illustrates how the prototype papain fold has been adapted to the requirements of an RNA virus. Thus, the protein scaffold has been reduced to a minimum core domain, with the active site being modified to increase specificity. Furthermore, surface features have been developed which enable C‐terminal self‐processing from the viral polyprotein.

Introduction

The structures of a number of viral proteases have been determined recently (Babé and Craik, 1997). Comparison of these viral structures with those of prototypes of the four classes of proteases reveals similarities, but also distinct differences in mechanism and structure. Thus, the picornaviral 3C cysteine protease and the NS3 serine protease of hepatitis C virus have a similar fold to the serine protease trypsin (Allaire et al., 1994; Matthews et al., 1994; Kim et al., 1996; Love et al., 1996). In contrast, the cytomegalovirus protease possesses a single β‐barrel structure, instead of the classic two β‐barrel motif of trypsin, and a novel catalytic triad of His/His/Ser (Chen et al., 1996; Tong et al., 1996). Furthermore, the adenovirus protease, a viral cysteine protease, also has a novel fold, although it employs a catalytic triad Asp/His/Cys reminiscent of that found in the cysteine protease papain (Ding et al., 1996). Papain‐like cysteine proteases are characterized by a tryptophan residue following the nucleophilic cysteine, and two hydrophobic residues following a conserved histidine which increases the nucleophilic character of the catalytically active cysteine (Berti and Storer, 1995). Indeed, a number of viral proteolytic enzymes have been proposed to be papain‐like cysteine proteases, based mostly on the presence of such conserved patterns (Gorbalenya et al., 1991). However, the lack of direct structural information on these viral cysteine proteases has prevented the establishment of further relationships between papain and its putative viral relatives.

The leader protease (L^pro) of foot‐and‐mouth disease virus (FMDV), an animal pathogen of global importance, is a proposed papain‐like viral cysteine protease. Even though sequence identity between the FMDV L^pro and papain is no higher than 15% (Gorbalenya et al., 1991; Skern et al., 1998), the characteristic residues of papain‐like proteases are conserved in the primary sequences of the L^pro from both FMDV and the related equine rhinoviruses (ERVs). L^pro is the first protein encoded on the FMDV polyprotein (Figure 1A). Its sole role in viral maturation is to free itself from the polyprotein by cleavage between its own C‐terminus and the N‐terminus of VP4 (Figure 1B) at the sequence ArgLysLeuLys↓GlyAlaGlySer. Theoretically, this self‐processing event can occur either intra‐ or intermolecularly (Figure 1B); expression of the L^pro in various in vivo and in vitro systems has provided evidence for both types of reaction (Belsham et al., 1990; Medina et al., 1993; Piccione et al., 1995b). Nevertheless, it has not yet been established which of the two mechanisms for self‐processing is preferred in vivo. As initiation of protein synthesis on the FMDV genome occurs at one of two AUG codons lying 84 nucleotides apart, two species of L^pro (known as Lab^pro and Lb^pro, depending on whether protein synthesis initiates at the first or second AUG codon, respectively) have been identified in infected cells and have been shown to possess the same enzymatic properties (Medina et al., 1993). In addition to Lb^pro, FMDV encodes two other proteolytic activities: the 2A peptide, which has been proposed to cleave autocatalytically between its own C‐terminus and the N‐terminus of 2B, and the 3C protease (Ryan and Flint, 1997). The 3C protease carries out all remaining cleavages, with the exception of that between VP4 and VP2, which occurs in the maturing capsid by an as yet unknown mechanism.

Figure 1. Schematic drawing of the biological activities of L^pro. (A) The RNA genome of FMDV. The single open reading frame is shown as an open box (with the mature viral proteins indicated), non‐coding regions by a line and the IRES with a closed box. L^pro forms are stippled. (B) Protein synthesis on FMDV mRNA; L^pro self‐processing is indicated as an intra‐ or intermolecular event. (C) Role of eIF4 proteins in initiation of protein synthesis and cleavage of eIF4G by Lb^pro. eIF4 proteins involved in protein synthesis and the 40S ribosomal subunit are indicated. The m⁷GDP 5′ cap structure of cellular mRNAs (open circle) and the Lb^pro cleavage site (arrow) are indicated. The eIF4G C‐terminal domain still forms an initiation complex with IRES‐containing mRNAs.

After the single self‐processing event, L^pro then plays a central role in FMDV replication by specifically cleaving the host cell protein eukaryotic initiation factor eIF4G at the sequence AlaAsnLeuGly↓ArgThrThrLeu (Figure 1C; Kirchweger et al., 1994). As a result, the domain of eIF4G which binds the cap‐binding protein eIF4E is separated from the domain of eIF4G which binds eIF3, so that the infected cell is unable to recruit its own capped mRNA to the 40S ribosome. The viral mRNA is unaffected as it initiates protein synthesis internally via an IRES (Figure 1C). This shuts off cellular mRNA translation in favour of the viral counterpart.
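The two cleavage sites quoted so far can be compared position by position. The following is a minimal sketch, not part of the original study, that splits each three‐letter sequence at the scissile bond and labels the residues with the P/P′ nomenclature used later in this article (P₁, P₂, ... counting away from the cleaved bond on the N‐terminal side, P′₁, P′₂, ... on the C‐terminal side). The function name `label_cleavage_site` and the variable names are hypothetical.

```python
import re

def label_cleavage_site(site):
    """Label residues of a site such as 'LysLeuLys↓GlyAlaGly' as P1, P2, ... and P1', P2', ...

    The arrow marks the scissile bond; residues are given in three-letter code.
    """
    left, right = site.split("↓")
    left_res = re.findall(r"[A-Z][a-z]{2}", left)
    right_res = re.findall(r"[A-Z][a-z]{2}", right)
    labels = {}
    # P positions count backwards from the scissile bond.
    for i, res in enumerate(reversed(left_res), start=1):
        labels[f"P{i}"] = res
    # P' positions count forwards from the scissile bond.
    for i, res in enumerate(right_res, start=1):
        labels[f"P{i}'"] = res
    return labels

# Sequences as quoted in the text.
polyprotein = label_cleavage_site("ArgLysLeuLys↓GlyAlaGlySer")
eif4g = label_cleavage_site("AlaAsnLeuGly↓ArgThrThrLeu")

# Positions at which both substrates carry the same residue.
shared = [pos for pos in polyprotein if polyprotein[pos] == eif4g.get(pos)]

for pos in polyprotein:
    print(pos, polyprotein[pos], eif4g[pos])
print("shared positions:", shared)
```

With these two sequences only P₂ (Leu) is shared, while the basic residue sits at P₁ in the polyprotein site and at P′₁ in the eIF4G site.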

We report here the three‐dimensional structure of two variants of the FMDV Lb^pro, present their relationship to the papain superfamily and discuss mechanisms of self‐processing and the basis of substrate recognition for specific cleavage on two different proteins.

Results and discussion

Structure determination

Wild‐type Lb^pro crystals could not be obtained. However, substitution of the active site nucleophile Cys51 (Piccione et al., 1995a; Roberts and Belsham, 1995; Ziegler et al., 1995) with Ala allowed crystals to be grown as reported (Guarné et al., 1996). These contained eight Lb^pro molecules in the asymmetric unit but were difficult both to reproduce and to manipulate. Thus, although a native diffraction data set from these crystals was obtained up to 3.0 Å resolution, the corresponding phases could not be determined. We speculated that the properties of these crystals resulted from interactions between the C‐terminus of one molecule and the active site of a neighbour, and therefore prepared a truncated form of the inactive Lb^pro lacking six C‐terminal amino acids (termed sLb^pro). New crystals were obtained containing two sLb^pro molecules in the asymmetric unit and the structure was solved by a combination of isomorphous replacement and density modification techniques. The Lb^pro Cys51Ala mutant was then determined by molecular replacement using the sLb^pro structure as a search model (Table I; see Materials and methods).

Table I. X‐ray parameters and refinement statistics

Data collection^#
	Lb^pro	sLb^pro (native)	sLb^pro (HgCl₂)	sLb^pro (K₂PtCl₄)
Space group	P2₁2₁2₁	C222₁	C222₁	C222₁
Resolution (Å)	25−3.0	25−3.0	25−2.7	25−2.8
No. of unique reflections	35 080	7339	9335	9083
Completeness (%)	92 (75)	84 (90)	82 (89)	94 (90)
<I/σI>	9.5 (3.0)	9.9 (4.4)	9.9 (7.4)	7.4 (4.1)
Multiplicity	7.7	6.0	8.3	9.6
R_sym (%)	10.3 (26.3)	9.3 (19.9)	8.5 (17.2)	9.7 (19.1)

MIR phasing (25.0−3.0 Å)^a
	sLb^pro (HgCl₂)	sLb^pro (K₂PtCl₄)
R_iso (%)	32.3 (38.4)	23.4 (30.3)
Sites	5	2
Phasing power	1.57	1.49
R_Cullis	0.68	0.79
Mean figure of merit	0.37	0.37

Model refinement (20.0−3.0 Å)
	Lb^pro	sLb^pro (native)
R_factor % (R_free, %)	25.8 (29.6)	23.5 (28.8)
No. of reflections	34 977	7320
Total no. of non‐hydrogen atoms	11 190	2565
No. of solvent molecules	3	20
R.m.s. deviation in bonds (Å)	0.005	0.005
R.m.s. deviation in angles (°)	1.2	1.3
R.m.s. deviation in temperature (bonded atoms, Å²)	2.4	1.6
Anisotropic correction (a*; b*; c*; Å²)	15.6; −16.9; 4.3	−1.3; 10.8; 9.2
Average B‐factor (Å²)	38.8	25.6

Numbers in parentheses correspond to the highest resolution shell (3.05−3.00 Å).

R_sym = Σ|I − <I>|/ΣI, where I = observed intensity and <I> = averaged intensity obtained by multiple observations of symmetry‐related reflections.

R_iso = Σ||F_PH| − |F_P||/Σ|F_P|, where |F_P| = protein structure factor amplitude and |F_PH| = heavy‐atom derivative structure factor amplitude.

R_Cullis = Σ||F_PH ± F_P| − F_H(calc)|/Σ|F_PH ± F_P| for all centric reflections.
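As an illustration of how two of the footnote statistics are evaluated, the sketch below computes R_sym and R_iso for a handful of made‐up numbers. It is a toy example under stated assumptions, not the DENZO/SCALEPACK or CCP4 implementation: the function names, the dictionary layout and all intensity and amplitude values are hypothetical.

```python
def r_sym(observations):
    """R_sym = sum|I - <I>| / sum(I).

    `observations` maps each unique reflection to the list of intensities
    measured for its symmetry-related copies (hypothetical layout).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator

def r_iso(f_p, f_ph):
    """R_iso = sum||F_PH| - |F_P|| / sum|F_P| over paired amplitudes."""
    numerator = sum(abs(abs(ph) - abs(p)) for p, ph in zip(f_p, f_ph))
    denominator = sum(abs(p) for p in f_p)
    return numerator / denominator

# Made-up data: two unique reflections with repeated measurements.
toy_observations = {
    (1, 0, 0): [100.0, 110.0, 90.0],
    (0, 1, 0): [50.0, 50.0],
}
print("R_sym =", r_sym(toy_observations))

# Made-up native and derivative amplitudes for two reflections.
print("R_iso =", r_iso([10.0, 20.0], [12.0, 17.0]))
```

Multiplying either value by 100 gives the percentage form used in Table I.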

Description of the overall Lb^pro structure

The Lb^pro structure (Figure 2A–C) presents a compact, globular region, ranging from Met29 to Tyr183 with an overall cubic shape of approximate edge dimensions of 30 Å, from which a flexible C‐terminal extension (CTE) ranging from Asp184 to Lys201 extrudes. The globular region is divided into two subdomains, with the catalytically essential residues Cys51 (replaced by Ala in the Lb^pro structure) and His148 located at the interface. The first, N‐terminal subdomain contains four α‐helices (α1, α2, α3 and α4) and two short antiparallel β‐strands (β1 and β2) comprising only residues Glu30–Thr32 and Lys38–Thr40, respectively. The longest α‐helices α1 and α3 (comprising residues Asn50–Glu64 and Leu78–Gly91, respectively) run perpendicular to each other, with the catalytic Cys51 being located towards the N‐terminal side of helix α1. The shortest helix α2 spans only six residues (Phe68–Ser73) and runs almost parallel to α3. The second Lb^pro subdomain displays a fold belonging to the all β‐family of proteins, as the only regular secondary structure elements are contained in a mixed β‐sheet formed by one parallel (β3 with β4) and six antiparallel (β4–β9) β‐strands (Figure 2A–C). The essential His148 is located on the turn connecting the longest strands β5 and β6 (residues Phe137–Leu143 and Ala149–Thr155) which occupy a central position in the sheet.

Figure 2. Overall view of the FMDV L^pro and its relationship to papain. (A) Structure‐based amino acid alignment from the indicated FMDV serotypes and ERV1. The alignments of FMDV serotypes are based on those of Ryan and Flint (1997), that of ERV1 is from Skern *et al*. (1998). Structurally equivalent residues in papain are shown. The N‐terminal positions of Lab^pro and Lb^pro, secondary structure elements and the start of the CTE are also indicated. Asterisks mark Asn46, Cys51, His148 and Asp163. (B and C) Views of the structure of the FMDV Lb^pro rotated 90°. α‐helices and β‐strands are coloured green and magenta, respectively. The active site Cys51 and His148 are shown as balls and sticks. The ordered CTE is in orange (referred to as CTE1), that containing the disordered amino acids (dashed lines) is in blue (referred to as CTE2). (D) Stereo drawing of superimposed C_α traces of Lb^pro (blue) and papain (yellow), using the standard view of papain (Kamphuis *et al*., 1985). The location of the prosegment‐binding loop (PBL) of papain is indicated (see text). [Figure 2B–D as well as all parts of Figures 3 and 4 were drawn using the programs MOLSCRIPT (Kraulis, 1991), with modifications by R.Esnouf (Esnouf, 1997), and RASTER3D (Merritt and Murphy, 1994).]

The main difference between the structures of Lb^pro and sLb^pro is that, in the latter, the 12 residues remaining in the CTE are disordered in both molecules present in the asymmetric unit (from residue Asp184 onwards). The superimposition of the globular regions of Lb^pro and sLb^pro models gives an r.m.s. deviation of 0.4 Å (performed using the program SHP; Stuart et al., 1979), which can be considered as an upper limit for the coordinate errors in the compact region of the two structures.

Several groups proposed a papain‐like fold for the picornaviral L^pro structure (Gorbalenya et al., 1991; Piccione et al., 1995a; Skern et al., 1998). The papain‐fold consists of left‐ (L) and right‐hand (R) domains [standard papain‐fold view (Kamphuis et al., 1985)] which are structurally equivalent to the two subdomains described for Lb^pro. Superimposition of main‐chain C_α atoms from Lb^pro with papain (Figure 2A and D) gives an averaged r.m.s. deviation of 1.3 Å for 76 equivalent residues. As the globular region of Lb^pro represents the smallest polypeptide fragment with a papain‐like topology, it lacks most of the decoration found in papain, including the prosegment‐binding loop (PBL; Coulombe et al., 1996; Figure 2D). Regions most conserved between the papain‐like proteases and Lb^pro structures are located around the active centre, especially secondary elements α1 and β5–β6, containing the catalytic cysteine and histidine residues (Figure 2B and C). Papain‐like proteases have, however, no equivalent to the Lb^pro CTE.
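The r.m.s. deviations quoted above are averages over paired, structurally equivalent C_α atoms after superposition. The sketch below shows only that final averaging step for coordinates that are assumed to be already superimposed; it does not reproduce the program SHP or its choice of equivalent residues, and the function name and toy coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input) between paired atoms.

    Both arguments are equal-length sequences of (x, y, z) tuples that are
    assumed to be already superimposed and listed in equivalent order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))

# Toy coordinates in Å: every atom of the second set is displaced by 1 Å along z.
set_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print("r.m.s. deviation (Å):", rmsd(set_a, set_b))
```

In practice the value depends on which residues are counted as equivalent, which is why the article reports the number of equivalent residues (76) alongside the 1.3 Å figure for papain.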

The extended conformations of the Lb^pro CTEs are stabilized by a network of intermolecular interactions originating from the exchange of the CTEs between neighbouring molecules. In four out of the eight Lb^pro molecules in the asymmetric unit, all residues in the CTE are visible (Figures 2C and 3) while, in the other four molecules, there is some disorder, and residues Glu186–Glu191 have not been traced (Figures 2C and 3). In the four molecules containing the disorder, the conformation of the visible residues is closely related to, though subtly different from that found in the other four (Figures 2B, C and 3). The 10 C‐terminal amino acids, residues Trp192–Lys201, are nevertheless well defined in all eight crystallographically independent subunits and present identical interactions with the substrate‐binding pocket of adjacent molecules in the crystal.

Figure 3. Stereo view of the disposition of the Lb^pro molecules in the crystal. The eight Lb^pro molecules contained in the asymmetric unit are coloured. The orientation of the unit cell axis is also indicated. Four molecules, with the ordered CTEs, are shown in yellow, while the remaining four, with the disordered CTE, are shown in green. The corresponding CTEs are shown in orange and blue, respectively. Six intermolecular disulfide bridges, represented as thick balls, can also be seen, four of them with molecules of adjacent asymmetric units (in grey). The quasipolymeric character of the packing, with extensive non‐covalent interactions alternating with disulfide bridges, is apparent. Two types of non‐crystallographic symmetries are present. The first (dashed red lines) relates pairs of neighbouring molecules, whilst the second (red spot) relates four molecules of the asymmetric unit with the other four.

Besides the exchange of the CTEs between neighbouring molecules, the crystal packing of both Lb^pro and sLb^pro forms shows a polymeric character brought about by two noteworthy peculiarities. First, there is a covalent intermolecular disulfide bridge between adjacent molecules (Figure 3) in both the sLb^pro and Lb^pro forms, although the spatial relationship between disulfide‐linked molecules differs in the two crystal forms. Secondly, there is a large contact area of 1088 Å² between the helical domains of the neighbouring molecules which are related by a local 2‐fold axis. Again, the dimer defined by the two subunits in contact is present in both the sLb^pro and the Lb^pro crystal forms. However, as Lb^pro functions enzymatically as a monomer, the biological relevance of both possible dimers is not clear.

The active site cleft

The active site containing the catalytic residues Cys51 and His148 is located on top of a deep cleft in the interdomain region, as observed for other members of the papain superfamily. Both the location of the active site and the spatial arrangement of the catalytic residues are well preserved (Figures 2B–D and 4A). In papain superfamily members, the active site histidine (P‐159; papain numbering is used throughout when describing papain‐like proteases) is maintained in the correct orientation with respect to the nucleophilic cysteine (P‐25) by a hydrogen bond to the side‐chain oxygen of a conserved asparagine residue (P‐175). Asp163 carries out this task in Lb^pro (Figure 4A and B), and this residue is strictly conserved in leader proteases (Figure 2A). In all members of the papain superfamily examined so far, a tryptophan residue (P‐177) covers the hydrogen bond formed between the Asn–His pair (Figure 4C); substitution of Trp P‐177 reduces papain activity (Berti and Storer, 1995). Neither this aromatic residue nor the 11 residue loop (P‐175– P‐185) that anchors it in the papain‐like enzymes is found in Lb^pro. In its place is a β‐turn containing a cluster of four acidic residues (Asp163, Asp164, Glu165 and Asp166; Figure 4B); these residues confer a strong local negative charge, so that the environment is quite different from those in most papain‐like enzymes (Figure 5). In the absence of the tryptophan residue, the carboxylate group of Asp163 may be required to form a stronger hydrogen bond with His148. Despite these differences, Lb^pro represents a papain‐like enzyme without this fully conserved tryptophan residue.

Figure 4. Active site of Lb^pro. (A) Arrangement of amino acid side chains around the active site (viewed down through the central helix α1) after superimposition of papain (grey) and Lb^pro (green) using the program SHP (Stuart *et al*., 1979). Catalytic residues of Lb^pro (Asn46, Cys51, His148 and Asp163) are in yellow, those of papain (Gln P‐19, Cys P‐25, His P‐159 and Asn P‐175) in grey. (B) Network of hydrogen bonds in the Lb^pro active centre. The acidic cluster (Asp163–Asp166) in the S′‐binding region is also shown. The orientation of the amide group of Asn46 is maintained by hydrogen bond interactions with the main‐chain nitrogen of Asp49 and the side chains of Asn54 and Asp164. (C) Hydrogen bonds in the papain active centre. The fully conserved Trp P‐177, in papain‐like enzymes, covering the hydrogen bond between essential catalytic residues Asn P‐175 and His P‐159 is also shown.

Figure 5. Electrostatic potential surfaces of, left to right, papain (Kamphuis *et al*., 1984; 1papM PDB code), Lb^pro and cathepsin L (Coulombe *et al*., 1996; 1aec PDB code). The yellow arrow indicates the Lb^pro active centre. The standard papain view is used in the top panel; that in the lower panel, obtained by a simple 90° rotation of the upper one, is down the α1 helix. The apparent differences in charge distribution reflect the indicated differences in isoelectric points, but also the different specificity of S and S′ subsites. [Figures 5 and 7 were generated by GRASP (Nicholls *et al*., 1991)].

Another important catalytic residue in papain superfamily members is a conserved glutamine (P‐19) whose side‐chain amide, together with the main‐chain nitrogen of the catalytic cysteine (P‐25), stabilizes the negative charge developing on the scissile carbonyl oxygen during nucleophilic attack. This structural feature, termed the oxyanion hole, is also present in Lb^pro. However, Asn46 replaces the conserved glutamine residue (Figure 4B and C). In Lb^pro, the arrangement of the turn positioning Asn46 and the orientation of its side‐chain amide [flipped by 180° compared with the equivalent glutamine (P‐19) in all other papain‐like proteases] differ from those of other members of the papain superfamily (compare Figure 4B and C). This is probably due to the fact that, in Lb^pro, only four residues separate Asn46 from Cys51, whereas five residues are found in other papain‐like proteases. The shorter side chain of Asn46 and its orientation are required because the tighter turn of the Lb^pro brings the main chain closer to the catalytic residues than in other papain‐like enzymes. The orientation of the side chain of Asn46 is fixed by hydrogen bonds to the main‐chain nitrogen of Asp49, forming an Asn‐pseudoturn, and the side‐chain carboxylate of Asp164 (Figure 4B). This aspartate residue, conserved in all Lb^pro sequences analysed so far (Figure 2A), is located immediately after the catalytic Asp163 in the acidic loop described above, and participates in an intricate network of hydrogen bonds involving residues Asn46, Asp49, Asn54 and Asp164 (Figure 4B). This network might contribute to the stability of the active site structure and catalytic activity.

The interaction between L^pro and its C‐terminus

FMDV L^pro frees itself from the growing polypeptide chain by specific cleavage at its own C‐terminus (Figures 1A,B and 2A). Thus, the presence of CTE residues inside the substrate‐binding pockets of adjacent molecules illustrates substrate recognition during self‐processing and represents, in fact, the P side of the substrate in the self‐processing reaction. The peptide backbone of the final residues of the CTE is in an extended conformation (Figure 3) similar to that observed in complexes of enzymes of the papain superfamily with peptide‐like inhibitors (Yamamoto et al., 1991, 1992).

The main interactions between the CTE and the substrate‐binding site (Figure 6) are provided by Lys201′ and Leu200′, with minor contributions from residues Lys199′–Val196′ (residues in the CTE of a symmetry‐related molecule are labelled with primed numbers). The final CTE residue, Lys201′, is positioned close to the active site Cys51 (replaced by Ala in this structure) as if catalysis had been completed. One of its carboxylate oxygens, located in the oxyanion hole, is hydrogen bonded to the side chain of Asn46 and the main‐chain nitrogen of Cys51, whereas the second carboxylate oxygen accepts a hydrogen bond from the imidazole ring of the catalytic His148 (Figure 6), expected to be protonated as described for papain (Yamamoto et al., 1991; Brocklehurst et al., 1998). The CTE establishes additional interactions with the substrate‐binding site through hydrogen bonds between main‐chain atoms. Thus, the main‐chain nitrogen of Lys201′ forms a hydrogen bond with the main‐chain carbonyl of Glu147; the main‐chain oxygen and nitrogen atoms of Leu200′ are hydrogen bonded to the main‐chain nitrogen and oxygen, respectively, of Gly98, building a short antiparallel β‐sheet. These hydrogen bond interactions are a conserved feature in substrate binding by papain superfamily members.

Figure 6. Stereo view of the interactions between the CTE and the S site. (A) Residues involved in the formation of subsites S₁–S₆ are shown as balls and sticks, with nitrogen atoms coloured in blue and oxygen atoms in red. Residues of the CTE are also shown and labelled Val196′–Lys201′. (B and C) Residues involved in the formation of the S₁ and S₂ subsites, respectively, are shown as balls and sticks with their corresponding electron density in light blue. The P₁ (Lys201′) and P₂ (Leu200′) residues of the CTE are also shown.

The S₁ subsite in Lb^pro is a narrow cleft bounded by the loop preceding the central helix α1 on one side, the β‐turn connecting strands β5 and β6 on the other side and the active site at the bottom (Figures 2B and 6A,B). In the Lb^pro structure, the aliphatic portion of the side chain of Lys201′, which occupies the S₁ subsite, is sandwiched between the main chain of residues His95–Glu96 and the side chain of Glu147, while its amino group establishes electrostatic interactions with the carboxylates of Glu96 and Glu147 (Figure 6A). The S₁ subsite in papain and other family members is a wide, unrestricted pocket which exerts relatively little influence on the substrate specificity (Figure 5). In Lb^pro, which clearly prefers lysine at P₁ in the self‐processing reaction (Figure 2A), this subsite has become narrower and deeper due to a rearrangement of the loop connecting strands β5 and β6 on the R domain. Amino acid sequence alignments and modelling of the ERV1 L^pro imply a correlation between the side chain at P₁ and that of residue 147. Thus, FMDV enzymes, with lysine in P₁, have a negatively charged glutamate at position 147; in contrast, the corresponding residue in ERV1, with serine at P₁, is Gly149, shorter and not charged (Figure 2A).

The side chain of Leu200′ (P₂) is completely buried in a hydrophobic S₂ pocket formed by Trp52, Gly97–Pro100, Leu143, Glu147–Ala149 and Leu178 (Figure 6A and C). The architecture of the Lb^pro S₂ subsite is very similar to that of other papain superfamily proteases; in fact, the residues defining the pocket in papain are identical to those in Lb^pro, with the exception of Leu143 which is equivalent to Val P‐133.

CTE residues Lys199′ (P₃) and Arg198′ (P₄) occupy loose pockets on opposite faces of the cleft, in the S₃ and S₄ subsites, respectively. The aliphatic portion of the side chain of Lys199′ makes van der Waals contacts with main‐chain atoms of residues Gly97–Gly98, and its amino group interacts through a hydrogen bond with the main‐chain carbonyl group of Glu93 and through weak ionic interactions with the side‐chain carboxylates of Glu93 and Glu96. Arg198′ makes van der Waals contacts with Gly98, Pro99, Leu143 and extensively with Gln146. Its guanidinium group also hydrogen‐bonds to the amide side chain of Gln146. Residue Gln197′ (P₅) has its side chain exposed to the solvent, but still contacts through its main chain Pro99, thus making a very open subsite S₅. Finally, Val196′ is buried in a hydrophobic cavity (subsite S₆ formed by residues Pro99, Ala101, Val127 and Leu178), located on the interdomain cleft, just underneath subsite S₂.

Biological implications of the structure

Self‐cleavage at the C‐terminus. The presence of the CTE in the active site of adjacent molecules argues for intermolecular self‐processing. However, although in the crystal structure of Lb^pro the CTE projects away towards neighbouring molecules, instead of folding back into its own substrate‐binding cleft, several structural features suggest that self‐processing in cis is possible and might even be favoured. First, the interface between the globular domains exchanging their CTEs is composed of weak interactions, indicating that this region is not designed to promote an intermolecular reaction. Secondly, residues located immediately after Tyr183, at which point the polypeptide chain leaves the globular region to begin the CTE, favour a turn. Notably, Asp184 and Glu186 are conserved in all serotypes of FMDV and in ERV1, enabling the placement of a conserved hydrophobic residue (Leu188) into a shallow hydrophobic pocket formed by residues Ala118, Pro121, Thr130, Met132 and Cβ of Asp136. This interaction leads the CTE polypeptide in a direction compatible with both cis and trans self‐processing. A polar, highly flexible stretch (residues Asn189–Glu191; modelled in Figure 7) should overcome the distance between this pocket and the above‐described subsites in the substrate‐binding cleft. A tryptophan residue at position 192 (or the aromatic residue found in ERV1 Tyr201) would enhance self‐processing in cis by stacking its aromatic ring with the exposed and conserved Trp105 (ERV1 Tyr103) of the globular domain. Thus, the CTE would reach subsite S₆ (interaction with Val196) with minor rearrangements of the main chain (Figure 7); the electrostatic and van der Waals interactions of the CTE with the substrate‐binding cleft would maintain the correct orientation for the self‐processing in cis.

Figure 7. Model for self‐processing of the Lb^pro *in cis*. The electrostatic potential surface of Lb^pro with CTE residues Asp184–Asn189 and Lys195–Lys201 as sticks. Coordinates from Lys195 to Lys201 correspond to a neighbouring molecule in the crystal. However, the orientation and the proximity of this fragment to Asn189 points towards a simple, direct connection within the same molecule of Lb^pro. This joining is indicated by dashes in the left panel and modelled by explicitly including residues Gly189–Ala194 in the right one. Polar amino acids would be exposed to the solvent, and only Trp192, pointing towards Trp105, appears to require minor rearrangements of the flexible and highly exposed Arg109. The green arrow indicates the position of the side chain of Cys133.

Why are intramolecular CTE interactions not observed in the Lb^pro crystal structure? First, the path of the CTE to its ‘own’ active site is partially blocked by the intermolecular disulfide bridge between neighbouring molecules. This bond is not believed to be present in the reducing environment inside the cell. Secondly, crystal packing requirements may have also favoured the observed CTE interactions. Finally, the ability to cleave eIF4G requires the CTE to be flexible and not remain in the active site. Evidence for the flexible nature of the CTE is provided by the lack of density in the sLb^pro form and the differences in the positions of certain residues in the disordered form of the CTE. The disorder in the CTE also appears to be energetically favoured, as freezing the polar CTE in a fixed conformation would impose a high entropic penalty.

Cleavage of eIF4G. The tendency of the CTE to leave the active site of the same polypeptide chain after cis cleavage is of significance for the recognition and cleavage of eIF4G by Lb^pro. Thus, the enzyme cannot be inhibited by binding to its own C‐terminus, which served as the recognition site for the cleavage on the viral polyprotein. However, as indicated above, this implies that the S site does not provide sufficient interactions to maintain the substrate in the active site; indeed, the absence of significant intramolecular product inhibition is probably a direct result of this inability. Thus, to bind to its cleavage site on eIF4G, which lacks a basic P₁ residue, the enzyme appears to employ the acidic patch of the S′ site (Figure 5) to provide an ionic interaction with the P′₁ Arg residue of the cleavage site on eIF4G. Indeed, it is noteworthy that all intermolecular substrates of Lb^pro identified so far in vitro contain basic residues at P′₁ or P′₂ or both, even when the P₁ residue is basic (data not shown). Taken together, these observations suggest that a substrate containing basic residues at both P₁ and P′₁ should be an optimized substrate for the Lb^pro. As, however, no data are available on the sequence preference of Lb^pro on peptide substrates, experiments are underway to investigate this notion.

The acidic patch at the S′ site of Lb^pro, coupled with the narrow cleft traversing the active site (Figures 5 and 6A), also appears to be the reason why Lb^pro is clearly much more specific than papain, although the two enzymes possess an S₂ pocket almost identical in composition and topology. Thus, Lb^pro does not cleave an immunoglobulin molecule, a classic substrate of papain. Furthermore, although eIF4G is an efficient substrate for papain, the cleavage products are not the same as those of Lb^pro (B.Hampoelz and T.Skern, unpublished).

Summary

The FMDV Lb^pro is the first crystal structure determined of a viral papain‐like cysteine protease. The structure shows the interactions involved in C‐terminal processing which enable cleavage between a lysine and a glycine residue to take place. In addition, examination of the substrate‐binding cleft indicates how the enzyme can carry out the specific cleavage of the host protein eIF4G which, in contrast, requires cleavage between a glycine and an arginine residue.

Materials and methods

Protein expression, purification and crystallization

The Cys51Ala mutant of the FMDV serotype O_1k Lb^pro was expressed in Escherichia coli BL21 (DE3) pLysS and purified as described (Kirchweger et al., 1994). A variant of the Cys51Ala mutant lacking six amino acids at the C‐terminus (sLb^pro) was expressed and purified similarly. Lb^pro crystals, belonging to space group P2₁2₁2₁ with cell dimensions a = 65.4 Å, b = 101.6 Å and c = 277.0 Å, were obtained by vapour diffusion against solutions containing 10% PEG 6000, 0.8 M MgCl₂, 0.1 M Tris–HCl, pH 8.5. There are eight molecules in the asymmetric unit and an estimated solvent content of 59%. sLb^pro crystals, belonging to space group C222₁ with unit cell dimensions of a = 51.0 Å, b = 130.0 Å and c = 126.2 Å, were grown from 8% PEG 4000, 0.2 M MgCl₂, 0.1 M Tris–HCl, pH 8.5. This crystal form has two molecules in the asymmetric unit and an estimated solvent content of 55%. sLb^pro crystals were harvested in solutions containing 10% PEG 4000, 0.2 M MgCl₂, 0.1 M Tris–HCl, pH 8.5, prior to data collection or heavy atom screening.
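The quoted solvent contents can be checked approximately from the cell dimensions with the Matthews coefficient. The sketch below is only an estimate under stated assumptions: a mean residue mass of 110 Da, 173 residues for Lb^pro (residues 29–201) and 167 for sLb^pro (six fewer), and the conventional factor of 1.23 Å³/Da for the protein fraction. The number of asymmetric units per cell (4 for P2₁2₁2₁, 8 for C222₁) follows from the space groups.

```python
# Rough Matthews-coefficient estimate of solvent content.
# Assumptions (not from the paper): 110 Da mean residue mass and the
# conventional 1.23 A^3/Da factor for the protein volume fraction.
MEAN_RESIDUE_MASS_DA = 110.0
PROTEIN_FACTOR = 1.23


def matthews(a, b, c, asu_per_cell, mol_per_asu, n_residues):
    """Return (Vm in A^3/Da, solvent fraction) for an orthorhombic cell."""
    volume = a * b * c  # all cell angles are 90 degrees
    mass = n_residues * MEAN_RESIDUE_MASS_DA
    vm = volume / (asu_per_cell * mol_per_asu * mass)
    solvent = 1.0 - PROTEIN_FACTOR / vm
    return vm, solvent


# Lb^pro: P2(1)2(1)2(1), eight molecules per asymmetric unit.
vm_lb, solvent_lb = matthews(65.4, 101.6, 277.0, 4, 8, 173)
# sLb^pro: C222(1), two molecules per asymmetric unit.
vm_slb, solvent_slb = matthews(51.0, 130.0, 126.2, 8, 2, 167)

print(f"Lb^pro : Vm = {vm_lb:.2f} A^3/Da, solvent = {solvent_lb:.0%}")
print(f"sLb^pro: Vm = {vm_slb:.2f} A^3/Da, solvent = {solvent_slb:.0%}")
```

With these assumed masses the estimates fall within a few percentage points of the values given above; the exact figures depend on the molecular mass used.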

The two heavy atom derivatives of the sLb^pro crystal used for phase determination were prepared by soaking at room temperature for 48 h in the harvesting solution containing 0.1 mM HgCl₂ or for 24 h in the harvesting solution which had been adjusted to 10 mM K₂PtCl₄.

Data collection

For cryogenic X‐ray data collection, native and derivative crystals were soaked in harvesting solutions made up to 25% ethylene glycol and flash‐frozen under a stream of boiled‐off nitrogen at 100 K (Oxford CryoSystems). X‐ray diffraction data sets were collected on MarResearch image plate detector systems using a Rigaku RU‐200B rotating anode and synchrotron facilities at the X11 EMBL outstation (DESY, Hamburg). Data were indexed, reduced, scaled and merged with DENZO and SCALEPACK (Otwinowski and Minor, 1997) (Table I). Most subsequent calculations were performed with the CCP4 Program Suite (Collaborative Computational Project Number 4, 1994).

Structure determination

The structure of the sLb^pro crystal form was determined by multiple isomorphous replacement. Heavy atom sites were located by Patterson methods and confirmed using cross‐phased difference maps. Refinement of heavy atom parameters and phase calculation were performed with SHARP (de la Fortelle and Bricogne, 1997). Initial phases were calculated at 3.0 Å (Table I), and were significantly improved using the density modification procedures in SOLOMON. An electron density map calculated using these phases showed clear molecular boundaries and allowed the identification of several secondary structure elements to which polyalanine chains were fitted. These elements and the positions of the heavy atom substitutions were used to determine the position of the local 2‐fold axis relating the two molecules in the asymmetric unit. Masks covering the monomer were created from skeletonized electron density and partial models using MAMA (Kleywegt and Jones, 1994), edited interactively with program O (Jones et al., 1991), and used for local averaging and solvent flattening in further cycles of density modification performed with DM (Cowtan, 1994). The final figure of merit for data in the range 25.0–3.0 Å was 0.68. The resulting electron density map was used for model building with program O. The initial model comprised 138 amino acid residues for each of the two copies in the asymmetric unit and had a crystallographic R‐factor of 0.47 for all reflections in the resolution range 10.0–3.0 Å.

Initial phases for the Lb^pro (residues 29–201) crystal form were obtained by molecular replacement with the sLb^pro coordinates as a model. Rotation/translation parameters were calculated with 95% of data between 15 and 3.5 Å using the AMoRe package (Navaza, 1994), together with information on the translational symmetry indicated by the strong peak found in the native Patterson at position u = 0.5, v = 0.5, w = 0.225 (Navaza et al., 1998). Four independent solutions were found; the remaining four molecules are generated from these by the translation. The molecular replacement solution gave a final correlation factor of 0.69 and an R‐factor of 0.40 for data in the same resolution range.
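How the translation reduces the search from eight molecules to four can be sketched as follows. The translation vector is the native Patterson peak given above; the four starting positions are hypothetical fractional coordinates chosen only for illustration, and treating the relation as an exact translation is a simplifying assumption.

```python
# Sketch: generate the second set of four molecules from the first four
# by the translation read off the native Patterson peak (u, v, w).
TRANSLATION = (0.5, 0.5, 0.225)  # from the text


def translate(frac_xyz, shift=TRANSLATION):
    """Shift a fractional coordinate and wrap it back into the unit cell."""
    return tuple((x + s) % 1.0 for x, s in zip(frac_xyz, shift))


# Hypothetical fractional positions of four independent molecules.
first_four = [
    (0.10, 0.20, 0.05),
    (0.35, 0.70, 0.12),
    (0.60, 0.15, 0.30),
    (0.80, 0.55, 0.40),
]
second_four = [translate(p) for p in first_four]
all_eight = first_four + second_four

for p, q in zip(first_four, second_four):
    print(p, "->", tuple(round(x, 3) for x in q))
```

Only the first four positions need to be found by the rotation and translation searches; the rest follow from the known vector.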

Refinement

Refinement followed standard protocols, with iterative cycles of the program X‐PLOR (Brünger, 1992) alternating with manual rebuilding in the interactive graphics program O. For both crystal forms, bulk solvent, overall anisotropic B‐factor corrections and tight non‐crystallographic restraints were introduced based on the behaviour of the R_free index (Table I). The non‐crystallographic restraints for the Lb^pro crystal form were applied considering two groups of 4‐fold symmetrically related molecules.

The refined atomic model for the sLb^pro form comprises residues 29–187 for both copies in the asymmetric unit and 20 ordered solvent molecules, and has an R‐factor of 23.5% (R_free = 28.8%) for all data between 20.0 and 3.0 Å. Of the non‐glycine residues, 82.1% fall within the ‘most favoured regions’ of the Ramachandran plot, on the higher side of acceptable for a 3.0 Å structure as defined by the program PROCHECK (Laskowski et al., 1994). The rest are inside the ‘additional allowed regions’, except for Asp164 which presents good electron density and is located at position i + 1 of a type II′ β turn. No electron density is observed for residues 184–187, which form part of the CTE and project away from the globular catalytic domain. The remainder of the residues, apart from a few solvent‐exposed side chains, present well‐defined electron density for both molecules in the asymmetric unit.

The refined atomic model for the Lb^pro form comprises residues 29–201 for four copies in the asymmetric unit while the other four do not include residues 186–192. The N‐terminal residue appears to be modified by a well‐defined acetyl group in the eight independent subunits. The present model has an R‐factor of 25.8% (R_free = 29.6%) for all data between 20.0 and 3.0 Å. No electron density is observed for residues 186–192 in copies with the disordered conformation of the CTE; however, the remaining residues, apart from a few solvent‐exposed side chains, present well‐defined electron density for all molecules in the asymmetric unit.
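For readers unfamiliar with the statistics quoted above, the following sketch shows how an R‐factor and R_free are conventionally computed from observed and calculated structure factor amplitudes. The amplitudes and the test‐set flags are invented numbers, and the scale factor between observed and calculated amplitudes is assumed to be 1; this is not the X‐PLOR implementation.

```python
# Sketch of the conventional crystallographic R-factor:
#   R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)
# R_free is the same quantity over reflections excluded from refinement.
def r_factor(f_obs, f_calc):
    """R-factor for paired amplitude lists (scale factor assumed to be 1)."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Hypothetical amplitudes; True marks a test-set (free) reflection.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0]
f_calc = [90.0, 85.0, 50.0, 45.0, 20.0]
is_free = [False, False, False, True, False]

work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if not free]
test = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if free]

r_work = r_factor(*zip(*work))
r_free = r_factor(*zip(*test))

print(f"R = {r_work:.3f}, R_free = {r_free:.3f}")
```

Because the test‐set reflections never enter refinement, a gap between the two values that widens during refinement is a warning sign of overfitting, which is why the restraints above were judged by the behaviour of R_free.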

Acknowledgements

We thank F.Torrents and A.Marina for assistance, J.Navaza for his valuable advice, and J.Bravo, F.X.Gomis‐Rüth and J.Seipelt for critical reading of the manuscript. This work was supported by the Austrian Science Foundation (P‐11222 to T.S.) and DGICYT [PB95‐0218 (to I.F.) and P96‐0271 (to J.T.)]. Data collection in Hamburg was supported by the Human Capital Mobility Project, contract CHGE‐CT93‐0040. A.G. is the recipient of a fellowship from the Ministerio de Educacion y Cultura (Spain).

References

Allaire M, Chernaia MM, Malcolm BA and James MNG (1994) Picornaviral 3C cysteine proteases have a fold similar to chymotrypsin‐like serine proteases. Nature, 369, 72–76.

Babé LM and Craik CS (1997) Viral proteases: evolution of diverse structural motifs to optimize function. Cell, 91, 427–430.

Belsham GJ, Brangwyn JK, Ryan MD, Abrams CC and King AMQ (1990) Intracellular expression and processing of foot‐and‐mouth disease virus capsid precursors using vaccinia virus vectors: influence of the L protease. Virology, 176, 524–530.

Berti PJ and Storer AC (1995) Alignment/phylogeny of the papain superfamily of cysteine proteases. J Mol Biol, 246, 273–283.

Brocklehurst K, Watts AB, Patel M, Verma C and Thomas EW (1998) Cysteine proteinases. In Sinnott,M. (ed.), Comprehensive Biological Catalysis. Vol I. Academic Press, San Diego, CA, pp. 381–423.

Brünger AT (1992) XPLOR Version 3.1: A System for X‐Ray Crystallography and NMR. Yale University Press, New Haven, CT.

Chen P et al. (1996) Structure of the human cytomegalovirus protease catalytic domain reveals a novel serine protease fold and catalytic triad. Cell, 86, 835–843.

Collaborative Computational Project, Number 4 (1994) The CCP4 Suite: Programs for Protein Crystallography. Acta Crystallogr, D50, 760–763.

Coulombe R, Grochulski P, Sivaraman J, Ménard R, Mort JS and Cygler M (1996) Structure of human procathepsin L reveals the molecular basis of inhibition by the prosegment. EMBO J, 15, 5492–5503.

Cowtan K (1994) DM, an automated procedure for phase improvement by density modification. Joint CCP4 and ESF‐EACBM Newslett. Protein Crystallogr, 31, 24–28.

de la Fortelle E and Bricogne G (1997) Maximum‐likelihood heavy‐atom parameter refinement in the MIR and MAD methods. Methods Enzymol, 276, 472–494.

Ding J, McGrath WJ, Sweet RM and Mangel WF (1996) Crystal structure of the human adenovirus protease with its 11 amino acid cofactor. EMBO J, 15, 1778–1783.

Esnouf RM (1997) An extensively modified version of Molscript that includes greatly enhanced colouring capabilities. J Mol Graph, 15, 133–138.

Gorbalenya AE, Koonin EV and Lai MM‐C (1991) Putative papain‐related thiol proteases of positive‐strand RNA viruses. FEBS Lett, 288, 201–205.

Guarné A, Kirchweger R, Verdaguer N, Liebig HD, Blaas D, Skern T and Fita I (1996) Crystallization and preliminary X‐ray diffraction studies of the Lb protease from foot‐and‐mouth disease virus. Protein Sci, 5, 1931–1933.

Jones TA, Zou JY, Cowan SW and Kjeldgaard M (1991) Improved methods for building protein models in electron density maps and the location of errors in this model. Acta Crystallogr, A47, 110–119.

Kamphuis IG, Kalk KH, Swarte MBA and Drenth J (1984) Structure of papain refined at 1.65 Å resolution. J Mol Biol, 179, 233–256.

Kamphuis IG, Drenth J and Baker EN (1985) Thiol proteases: comparative studies based on the high‐resolution structures of papain and actinidin, and on amino acid sequence information from cathepsins B and H, and stem bromelain. J Mol Biol, 182, 317–329.

Kim JL et al. (1996) Crystal structure of the hepatitis C virus NS3 protease domain complexed with a synthetic NS4A cofactor peptide. Cell, 87, 345–355.

Kirchweger R et al. (1994) Foot‐and‐mouth disease virus leader protease: purification of the Lb form and determination of its cleavage site on eIF‐4γ. J Virol, 68, 5677–5684.

Kleywegt GJ and Jones TA (1994) Halloween masks and bones. In Bailey,S., Hubbard,R. and Waller,D. (eds), From the First Map to Final Model SERC Daresbury Laboratory, Warrington, UK, pp. 59–66.

Kraulis PJ (1991) MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J Appl Crystallogr, 24, 946–950.

Laskowski RA, MacArthur MW, Smith DK, Jones DT, Hutchinson EG, Morris AL, Naylor D, Moss D and Thornton JM (1994) PROCHECK Manual Version 3.0 Oxford Molecular Ltd, Oxford, UK.

Love RA, Parge HE, Wickersham JA, Hostomsky Z, Habuka N, Moomaw EW, Adachi T and Hostomska S (1996) The crystal structure of hepatitis C virus NS3 protease reveals a trypsin‐like fold and a structural zinc binding site. Cell, 87, 331–342.

Matthews DA et al. (1994) Structure of human rhinovirus 3C protease reveals a trypsin‐like polypeptide fold, RNA‐binding site, and means for cleaving precursor polyprotein. Cell, 77, 761–771.

Medina M, Domingo E, Brangwyn JK and Belsham GJ (1993) The two species of the foot‐and‐mouth disease virus leader protein, expressed individually, exhibit the same activities. Virology, 194, 355–359.

Merritt EA and Murphy MEP (1994) Raster3D version 2.0. A program for photorealistic molecular graphics. Acta Crystallogr, D50, 869–873.

Navaza J (1994) AMoRe: an automated package for molecular replacement. Acta Crystallogr, A50, 157–163.

Navaza J, Panepucci EH and Martin C (1998) On the use of strong Patterson function signals in many‐body molecular replacement. Acta Crystallogr, D54, 817–812.

Nicholls A, Sharp K and Honig B (1991) Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins, 11, 281–296.

Otwinowski Z and Minor W (1997) Processing of X‐ray diffraction data collected in oscillation mode. Methods Enzymol, 276, 307–326.

Piccione ME, Zellner M, Kumosinski TF, Mason PW and Grubman MJ (1995a) Identification of the active‐site of the L protease of foot‐and‐mouth disease virus. J Virol, 69, 4950–4956.

Piccione ME, Sira S, Zellner M and Grubman MJ (1995b) Expression in Escherichia coli and purification of biologically active L proteinase of foot‐and‐mouth disease virus. Virus Res, 35, 263–275.

Roberts PJ and Belsham G (1995) Identification of critical amino acids with the foot‐and‐mouth disease Leader protein, a cysteine protease. Virology, 213, 140–146.

Ryan MD and Flint M (1997) Virus‐encoded proteases of the picornavirus super‐group. J Gen Virol, 78, 699–723.

Skern T, Fita I and Guarné A (1998) A structural model of picornavirus leader proteases based on papain and bleomycin hydrolase. J Gen Virol, 79, 301–307.

Stuart DI, Levine M, Muirhead H and Stammers DK (1979) The crystal structure of a cat pyruvate kinase at a resolution of 2.6 Å. J Mol Biol, 134, 109–142.

Tong L, Qian C, Massariol M‐J, Bonneau PR, Cordingley MG and Lagacé L (1996) A new serine‐protease fold revealed by the crystal structure of human cytomegalovirus protease. Nature, 383, 272–282.

Yamamoto D, Matsumoto K, Ohishi H, Ishida T, Inoue M, Kitamura K and Mizuno H (1991) Refined X‐ray structure of papain·E‐64‐c complex at 2.1 Å resolution. J Biol Chem, 266, 14771–14777.

Yamamoto A et al. (1992) Crystal structure of papain‐succinyl‐Gln‐Val‐Val‐Ala‐Ala‐p‐nitroanilide complex at 1.7 Å resolution: noncovalent binding mode of a common sequence of endogenous thiol protease inhibitors. Biochemistry, 31, 11305–11309.

Ziegler E, Borman AM, Kirchweger R, Skern T and Kean KM (1995) Foot‐and‐mouth disease virus Lb proteinase can stimulate rhinovirus and enterovirus IRES‐driven translation and cleave several proteins of cellular and viral origin. J Virol, 69, 3465–3474.

Published In

The EMBO Journal

Vol. 17 | No. 24
15 December 1998

Pages: 7469 - 7479

Submission history

Received: 13 August 1998

Revision received: 14 October 1998

Accepted: 15 October 1998

Published online: 15 December 1998

Published in issue: 15 December 1998

Authors

Affiliations

Alba Guarné

Centre d'Investigació i Desenvolupament (CSIC), Jordi Girona Salgado 18–26 E‐08034 Barcelona Spain

José Tormo

Centre d'Investigació i Desenvolupament (CSIC), Jordi Girona Salgado 18–26 E‐08034 Barcelona Spain

Regina Kirchweger

Institute of Biochemistry, Medical Faculty, University of Vienna, Dr Bohr‐Gasse 9/3 A‐1030 Vienna Austria

Present address: Department of Cell Biology and Genetics, Sloan Kettering Memorial Cancer Center East 68th Street New York NY USA

Doris Pfistermueller

Institute of Biochemistry, Medical Faculty, University of Vienna, Dr Bohr‐Gasse 9/3 A‐1030 Vienna Austria

Ignasi Fita

Centre d'Investigació i Desenvolupament (CSIC), Jordi Girona Salgado 18–26 E‐08034 Barcelona Spain

Tim Skern^*

Institute of Biochemistry, Medical Faculty, University of Vienna, Dr Bohr‐Gasse 9/3 A‐1030 Vienna Austria

Notes

Corresponding author. E-mail: [email protected]
