
Denaturant-dependent folding of GFP

Edited by Peter G. Wolynes, Rice University, Houston, TX, and approved June 12, 2012 (received for review March 31, 2012)
July 9, 2012
109 (44) 17832-17838

Abstract

We use molecular simulations of a coarse-grained model to map the folding landscape of Green Fluorescent Protein (GFP), which is extensively used as a marker in cell biology and biotechnology. Thermal and Guanidinium chloride (GdmCl) induced unfolding of a variant of GFP, without the chromophore, occurs in an apparent two-state manner. The calculated midpoint of the equilibrium folding in GdmCl, taken into account using the Molecular Transfer Model (MTM), is in excellent agreement with the experiments. The melting temperatures decrease linearly as the concentrations of GdmCl and urea are increased. The structural features of rarely populated equilibrium intermediates, visible only in free energy profiles projected along a few order parameters, are remarkably similar to those identified in a number of ensemble experiments in GFP with the chromophore. The excellent agreement between simulations and experiments shows that the equilibrium intermediates are stabilized by the chromophore. Folding kinetics, upon temperature quench, show that GFP first collapses and populates an ensemble of compact structures. Despite the seeming simplicity of the equilibrium folding, flux to the native state flows through multiple channels and can be described by the kinetic partitioning mechanism. Detailed analysis of the folding trajectories shows that both equilibrium and several kinetic intermediates, including misfolded structures, are sampled during folding. Interestingly, the intermediates characterized in the simulations coincide with those identified in single molecule pulling experiments. Our predictions, amenable to experimental tests, show that MTM is a practical way to simulate the effect of denaturants on the folding of large proteins.
Our understanding of the folding mechanisms of small single domain proteins has advanced significantly over the last twenty years thanks to theoretical developments (1–7), simulations (8–13), and advances in experimental methods (14–19). Indeed, it can be justly stated that simulations of structure-based models are remarkably successful in predicting the folding mechanisms in the presence (20–22) and absence of denaturants (23–25). More refined questions, such as the relationship between pathway diversity and the symmetry of the underlying native structures (26), transition path times for crossing free energy barriers (27), and the nature of the transition state ensemble (28–31), have also been addressed using theory and experiments. In contrast, much less is known about how large proteins, with the number of amino acids, N, exceeding ≈200 and with complex β-sheet topology, fold.
Folding mechanisms of Green Fluorescent Protein (GFP), with predominantly β-sheet native structures stabilized by contacts between residues that are well separated along the sequence, have been extensively investigated (32–40). However, the details of the folding kinetics, including the structures in the states that are visited along the folding routes, are unexplored. It has been difficult to describe quantitatively the folding of GFP because of the extraordinarily long equilibration times, which have made it non-trivial to obtain consistent values of even thermodynamic quantities using different experimental probes. For example, the number of intermediates and their stabilities, and the associated denaturant-dependent properties (m-values), are not unambiguously known for many of the variants of GFP. Because of the presence of a visual chromophore that is shielded from water by the barrel structure (Fig. 1A), GFP has been used as a marker in a number of biotechnology and biophysical applications ranging from protein-protein interactions to gene expression (41). Consequently, it is important to map in detail the folding landscapes of FPs (32), which could aid in the development of new GFP tools for use in in vivo localization studies.
Fig. 1.
GFP structure and thermal folding. (A) Ribbon diagram of Citrine (Yellow version of GFP, PDB ID: 1HUY). Residues marked in orange, green and mauve are Lys3, Asn198, and Asn212, respectively. (B) Splay representation of GFP. The N-terminus β-strands are represented in blue, the kinked helix at the center of the β-strand barrel is in green, the three β-strands in the center, which form local contacts are in silver, and the C-terminus β-strands are in red. (C) Fraction of GFP in the NBA, INT, and UBA as a function of temperature, T, are shown in triangles, diamonds, and circles respectively. The left inset shows fINT(T) as a function of T. The right inset shows energy, E, and the heat capacity, Cv as a function of T. (D) Free energy of GFP as a function of E for temperatures above and below the melting temperature (T = 379 K). The structure of one of the intermediates is shown in splay representation.
The folded structure of GFP has eleven β-strands that are closed to form the characteristic barrel structure (Figs. 1A and 1B). The chromophore, surrounded by the barrel and the capping helices, causes hysteresis (denaturant-induced equilibrium folding and unfolding do not coincide even on time scales that exceed hours) in the folding of a number of variants of GFP (34, 35). Mutants that inhibit chromophore formation fold without hysteresis, and could form the basis of describing the de novo folding of the complex barrel structure.
Here, we use simulations of a coarse-grained (CG) Self-Organized Polymer model (42) with side chains (SOP-SC) (21) of GFP, with N = 230. Our simulations are performed for a chromophore-less variant of GFP, Citrine, which differs from the wild type sequence by point mutations at five locations, S65G/V68L/Q69M/S72A/T203Y (36). Thermal denaturation and unfolding induced by Guanidinium chloride (GdmCl), simulated using the Molecular Transfer Model (MTM) (20), show that equilibrium folding can be approximately described by a two-state model. Folding trajectories probing equilibrium fluctuations and free energy profiles point to the presence of intermediates. We show, in agreement with experiments (33, 38), using a variety of probes of the structures (distributions of the radius of gyration, Rg, solvent accessible surface area, and structural overlap function, χ) that in the intermediate, the N-terminus is folded whereas the C-terminal region comprising strands β7 to β10 is flexible and disordered. Upon initiating folding, the time-dependent decrease in the average radius of gyration, ⟨Rg(t)⟩, obtained using Brownian dynamics simulations (21, 43), shows that compaction occurs in at least two major stages. In accord with the predictions of the kinetic partitioning mechanism (KPM) (44), Citrine folds by at least four distinct routes. In the dominant pathway, folding commences from preformed local β-sheet structures and occurs in an almost all-or-none manner. Stable local structures associate by a diffusion-collision mechanism (45), resulting in the folded state. Besides providing testable predictions, such as the linear decrease in the melting temperature as a function of GdmCl and urea concentrations, our work shows that the complexity of GFP folding can be fairly accurately captured using the SOP-SC model in conjunction with MTM.

Results

Thermal Denaturation:

We used multicanonical (MC) molecular dynamics simulations (46, 47) so that thermodynamic properties of Citrine can be computed accurately (see SI Text, and Movie S1 for details). The dependence of the total energy on temperature, T, and the associated heat capacity both indicate that folding occurs cooperatively (inset in Fig. 1C). The melting temperature, corresponding to the peak in Cv(T) in the absence of denaturants, is Tm = 379 K (see right inset in Fig. 1C). The structural overlap function, χ, is used to distinguish between the Native Basin of Attraction (NBA), equilibrium intermediates (IEQL), and UBA, the unfolded basin of attraction (Methods). The fraction of molecules in the Native Basin of Attraction, fNBA(T), as a function of T shows that the folding transition is cooperative (Fig. 1C), suggesting that a two-state description is adequate. The value of Tm computed using fNBA(Tm) = 0.5 yields Tm = 379 K, which coincides with the peak in the heat capacity (Fig. 1C). The Tm value obtained in our simulations is in very good agreement with calorimetry measurements (48, 49), thus validating the SOP-SC model.
The two-state nature of thermal unfolding obtained from Fig. 1C hides the presence of equilibrium intermediates, which are evident in the MC folding trajectory (see Fig. S1 in the SI Text). The time dependent changes in E and χ in Fig. S1 show that at least two equilibrium states (hereafter referred to collectively as IEQL) are populated. The dips in the plots of free energy profiles, F(E) as a function of E, above and below Tm correspond to population of intermediates with χ values in the range specified in Methods (Fig. 1D). The inset in Fig. 1C shows that fINT(T) is very small, which explains the apparent two-state folding of Citrine in the absence of the chromophore (see also (34)). The finding that IEQL are populated during the folding process (see also Fig. 2) provides indirect support to the analysis used in experimental studies, which found that a three-state description provides a quantitative fit to denaturant unfolding of Citrine in the presence of chromophore. Thus, the intermediate is, in all likelihood, stabilized by the chromophore.
Fig. 2.
Denaturant effects on GFP: (A) Dependence of fα([C]) (α = NBA, INT, and UBA) as a function of GdmCl concentration. Blue triangles, black diamond, and red circles are used for fNBA([C]), fINT([C]), and fUBA([C]), respectively. The experimental data for fraction of protein in UBA as a function of [C] is shown in green squares. The right inset shows the low probability of populating the intermediate species. The left inset shows the linear variation of free energy difference between the NBA and UBA, ΔGNU, as a function of denaturant concentrations. The data in red and green are for GdmCl and Urea, respectively. (B) Free Energy profiles, F(Rg), of GFP as a function of Rg show the population of a thermodynamic intermediate corresponding to the shaded green area. (C) Heat capacity as a function of GdmCl. (D) Variations in the melting temperature, corresponding to the peak in Cv(T), as a function of denaturant concentrations. The upper curve is for urea and the lower one corresponds to GdmCl.

Denaturant Induced Unfolding:

To make a direct comparison with ensemble experiments (33, 34) we simulated the effects of Guanidinium chloride (GdmCl) using the MTM (see Methods and SI Text for details). Following our previous work (21), we choose a simulation temperature, Ts, at which the free energy difference between the NBA and the unfolded state, ΔGNU (= GN(Ts) - GU(Ts)), calculated from simulations agrees with the value measured for the R96A mutant (34) at denaturant concentration [C] = 0. We chose R96A as a reference GFP for fixing Ts because it exhibits a clear equilibrium two-state transition (34). Since the absolute values of the effective interaction energies in CG models cannot be determined, we used Ts as an adjustable parameter to get an accurate estimate of ΔGNU. The choice of Ts, which is the only parameter that is fixed in our simulations, amounts to choosing the overall free energy scale (21). The dependence of fNBA([C]) and fUBA([C]) on [C] at Ts = 368.2 K also shows that Citrine folds and unfolds reversibly in an apparent two-state manner (Fig. 2A). However, the right inset in Fig. 2A shows signs of the presence of an intermediate.
The midpoint of the folding transition, f([Cm]) = 0.5, yields Cm = 1.3 M. The simulated equilibrium titration curves can be fit using a two-state model, from which the dependence of ΔGNU([C]) on [C] can be calculated. A linear fit of ΔGNU([C]) to [C] yields the apparent stability at [C] = 0 and mGdmCl ≈ 11.3 kcal/mole.M. We attribute the difference between the two estimates of the stability at [C] = 0 to well established uncertainties in the measured transfer energies at low GdmCl concentration. For comparison we also show the experimental results on R96A. The left inset in Fig. 2A, which shows the dependence of ΔGNU([C]) on [C] using the experimental data on R96A, yields the experimental m-value and Cm ≈ 1.3 M. The m and Cm values obtained from simulations are in reasonable agreement with experiments.
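The two-state analysis can be illustrated with a minimal sketch. It assumes the linear form ΔGNU([C]) = m([C] - Cm), which places the midpoint at Cm by construction, together with the usual two-state Boltzmann relation between ΔGNU and the native population. Only mGdmCl ≈ 11.3 kcal/mole.M, Cm = 1.3 M and Ts = 368.2 K come from the text. The function names are hypothetical, and the intercept implied by this form is a consequence of the assumed linear model, not a value reported by the authors.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)

M_GDMCL = 11.3   # m-value from the text, kcal/(mol M)
C_MID = 1.3      # midpoint concentration from the text, M
T_SIM = 368.2    # simulation temperature Ts from the text, K


def delta_g_nu(conc, m_value=M_GDMCL, c_mid=C_MID):
    """Assumed linear model: Delta G_NU([C]) = m * ([C] - Cm), in kcal/mol.

    Delta G_NU = G_N - G_U, so negative values mean the native state is stable.
    """
    return m_value * (conc - c_mid)


def fraction_native(conc, temperature=T_SIM):
    """Two-state population of the native basin at concentration conc (M)."""
    dg = delta_g_nu(conc)
    return 1.0 / (1.0 + math.exp(dg / (K_B * temperature)))


if __name__ == "__main__":
    for conc in (0.0, 1.0, 1.3, 1.6, 3.0):
        print(f"[C] = {conc:.1f} M  f_NBA = {fraction_native(conc):.4f}")
```

Because the m-value is large compared with kB Ts, the native population in this sketch switches from nearly 1 to nearly 0 over a narrow window of [C] around Cm, which is the sharp, apparently two-state titration curve described above.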
To check that the discrepancy in the stability at [C] = 0 is due to the problems associated with transfer free energy measurements involving GdmCl, we calculated the dependence of ΔGNU([C]) on [C] for urea induced unfolding using the MTM theory (20). The left inset in Fig. 2A shows that ΔGNU([C]) also varies linearly as the urea concentration [C] increases. The calculated value of ΔGNU([0]) ≈ -16.4 kcal/mole is in excellent agreement with experiment (34). The MTM based theory predicts that murea ≈ 5.7 kcal/mole.M, which is considerably smaller than mGdmCl. The predictions for urea induced unfolding of Citrine can be tested experimentally.
The apparent all-or-none transition in Fig. 2A gives no indication of the presence of IEQL (see also Fig. 1C). However, the free energy profiles, F(Rg), as a function of Rg (Fig. 2B) show that there are two high energy intermediates at all values of [C]. As is the case in thermal denaturation, the probability of observing these states is extremely small even though the signature of their presence is evident in the folding trajectories (Fig. S1 in SI Text). The free energy profiles projected along E and Rg are similar, implying that these states are structurally similar.
The heat capacity curves at various GdmCl concentrations show that the temperature of the peak, Tm([C]), decreases as [C] increases (Fig. 2C). The decrease in Tm([C]) is linear for both GdmCl and urea (Fig. 2D). The variations in Tm([C]) can be fit using Tm([C]) = Tm([0]) - αi[C], where αurea = 3.85 K/M and αGdmCl = 5.7 K/M. Taken together, the simulations show that the global equilibrium folding of Citrine, without the chromophore, induced by temperature or denaturants is cooperative.
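As a small worked example of the linear relation Tm([C]) = Tm([0]) - αi[C], the sketch below evaluates the predicted melting temperatures for the two denaturants. The slopes are the fitted values quoted above. Using Tm([0]) = 379 K, the denaturant-free value from the thermal simulations, as the common intercept is an assumption, and the function name is hypothetical.

```python
TM_ZERO = 379.0  # K, melting temperature at [C] = 0 (assumed common intercept)

# Fitted slopes from the text, in K/M
ALPHA = {"urea": 3.85, "GdmCl": 5.7}


def melting_temperature(conc, denaturant):
    """Linear estimate Tm([C]) = Tm(0) - alpha * [C], with conc in M."""
    return TM_ZERO - ALPHA[denaturant] * conc


if __name__ == "__main__":
    for conc in (0.0, 1.0, 2.0, 4.0):
        tm_urea = melting_temperature(conc, "urea")
        tm_gdmcl = melting_temperature(conc, "GdmCl")
        print(f"[C] = {conc:.1f} M  Tm(urea) = {tm_urea:.2f} K  Tm(GdmCl) = {tm_gdmcl:.2f} K")
```

At any nonzero concentration the GdmCl value lies below the urea value, consistent with the ordering of the two curves in Fig. 2D.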

Structural Characteristics of IEQL and Unfolded States:

From the multicanonical trajectories (Fig. S1) and the free energy profiles (Fig. 1D and Fig. 2B) we infer that there are two equilibrium intermediates. In one of the intermediates, interactions involving predominantly the C-terminal strands (β7 - β10) are disrupted. Almost all of the contacts in the C-terminal strands are broken in the second intermediate (Fig. 3A). In both intermediates, the N-terminal β strands are formed and undergo small fluctuations (Fig. 3A). Interestingly, H/D exchange experiments show that these regions have the largest exchange rates, suggesting that in IEQL the C-terminal region is flexible (33), whereas the N-terminal strands are more-or-less intact. Thus, the structural attributes of the equilibrium intermediates in our simulations are similar to those identified in experiments.
Fig. 3.
(A) Distribution P(ΔR), where ΔR is the relative accessible surface area (see text for details), in one of the kinetic intermediates with attributes similar to that found under equilibrium. The typical structure sampled at equilibrium is on the left whereas the structure on the right represents a conformation obtained in the kinetic simulation in the EQL pathway. (B) Distribution, P(r), as a function of r, the distance between a pair of sites for NBA (triangles), INT (diamonds), and UBA (circles) computed using conformations sampled at equilibrium with [C] = 0. (C) Distribution of Rg as a function of GdmCl concentration. The midpoint of the transition is Cm = 1.3 M. The inset shows P(Rg) at large Rg.
To compare with the SAXS experiments, we also calculated the pair distance distribution P(r) for the folded, IEQL, and the unfolded states by varying the GdmCl concentration (Fig. 3B). At low [C], we find that P(r) is peaked around 20 Å. The mean, ≈20 Å, compares well with the value obtained from the positions of the ith and jth beads in the X-ray structure. The P(r) for IEQL is broader with a shoulder around 50 Å, which corresponds to structures with complete disruption of the C-terminal structure. The corresponding value for IEQL is ≈ 43.8 Å. The P(r) for the unfolded state at high GdmCl concentration is broad with a tail that extends to ≈200 Å. The features found in P(r) for the three states are in broad agreement with SAXS experiments (38) on a different variant of GFP in which unfolding is triggered by decreasing pH. However, the values for IEQL and the unfolded state obtained from simulations are considerably higher than the estimates from SAXS at pH = 4 and pH = 2.2, respectively. It is unclear if the greater compaction of IEQL and the unfolded state at acidic pH compared to simulations is due to the different conditions (pH versus temperature), the restricted range of scattering vectors (38), or the shortcomings of the SOP-SC model. The distribution P(Rg) of Rg shows that as [C] increases GFP samples extended conformations with Rg exceeding 80 Å (Fig. 3C).

Collapse and Folding Kinetics:

We used Brownian dynamics simulations to generate 80 folding trajectories starting from an initial ensemble of equilibrated structures at a high temperature (TH > 420 K) and reducing the temperature to T = 300 K to initiate folding. From the distribution of first passage times, PFP(s), we calculated the probability, PU(t), that Citrine has not folded at time t (inset in Fig. 4A). The folding time obtained from the exponential decay, PU(t) ≈ exp(-t/τF), is τF ≈ 5 ms. Although a wide range of τF values is reported in experiments, the smallest value of τF for Citrine from recent experiments is ≈1 s (36). Thus, our estimate of τF is at least a hundredfold smaller than experimental values, which could arise from the neglect of solvent and the coarse-grained representation of the protein.
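The step from first passage times to PU(t) can be sketched as follows. The sketch assumes that PU(t) is estimated as the fraction of trajectories whose first passage time exceeds t, and that for a single-exponential PU(t) the mean first passage time is an estimate of τF. The function names and the four input times are hypothetical placeholders, not simulation data.

```python
def unfolded_fraction(first_passage_times, t):
    """Fraction of trajectories that have not yet folded at time t."""
    n_total = len(first_passage_times)
    n_unfolded = sum(1 for s in first_passage_times if s > t)
    return n_unfolded / n_total


def folding_time_estimate(first_passage_times):
    """Mean first passage time.

    For P_U(t) = exp(-t / tau_F) this mean is the maximum-likelihood
    estimate of tau_F.
    """
    return sum(first_passage_times) / len(first_passage_times)


if __name__ == "__main__":
    # Hypothetical first passage times in ms, for illustration only.
    times_ms = [1.0, 2.0, 3.0, 4.0]
    for t in (0.0, 2.5, 4.0):
        print(f"t = {t:.1f} ms  P_U = {unfolded_fraction(times_ms, t):.2f}")
    print(f"tau_F estimate = {folding_time_estimate(times_ms):.2f} ms")
```

With only 80 trajectories, an estimate of this kind carries a sizeable statistical uncertainty, so a fit to the full PU(t) curve and the mean first passage time need not agree exactly.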
Fig. 4.
GFP folding kinetics. (A) Time (t) dependent decrease in ⟨Rg(t)⟩. The solid line in green is a two exponential fit to the data. The inset shows the fraction of trajectories unfolded, PU(t), as a function of t, with the solid black line being an exponential fit. (B) Total energy E as a function of t for folding along four pathways. A representative trajectory for each pathway is shown. (C) Dynamic profiles generated using the folding trajectories in terms of χ and Rg for KIN1 and KIN2 in red and black respectively. The ribbon diagram of the kinetic intermediate in KIN2 shows that β1 - β3 sheets are not packed onto the rest of the GFP barrel structure. (D) Plot of χ as a function of Rg for folding pathways KIN3 and EQL in green and blue, respectively. Interactions involving the helix and β1 - β3 in the first intermediate sampled in KIN3 with the rest of the barrel are absent. The second intermediate in KIN3 is similar to the one observed for KIN2. In the intermediate populated in the EQL pathway, the C-terminal β-sheets do not interact with the ordered N-terminal strands β1 - β6.
The time-dependent changes in ⟨Rg(t)⟩ during the folding process show that collapse occurs in two stages (Fig. 4A). The decay of ⟨Rg(t)⟩ can be fit using a two-exponential function with amplitudes a1 = (23.4 ± 0.2) Å and a2 = (22.5 ± 0.2) Å and time constants τ1 = 3.6 ms and τ2 = 43.47 ms (Fig. 4A). The decrease in the value of Rg(t) in each stage is roughly the same (≈22 Å). However, there is a separation in the time scales of the two stages, implying that distinctly different ensembles of structures are sampled in the two stages. The structures reached at the end of the first stage correspond to minimum energy compact structures (50) that guide the folding of Citrine. The predicted multistage compaction of GFP, similar to that found in other proteins (51), can be tested using time-resolved SAXS experiments.
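To make the two-stage picture concrete, the sketch below evaluates a two-exponential decay with the fitted amplitudes and time constants quoted above. The functional form a1 exp(-t/τ1) + a2 exp(-t/τ2) is an assumption about the fit. The long-time baseline of ⟨Rg(t)⟩ is not quoted in the text, so only the excess over that baseline is computed, and the function name is hypothetical.

```python
import math

A1, A2 = 23.4, 22.5      # fitted amplitudes from the text, in Angstrom
TAU1, TAU2 = 3.6, 43.47  # fitted time constants from the text, in ms


def rg_excess(t_ms):
    """Excess of <Rg(t)> over its long-time value, assuming a biexponential decay."""
    fast = A1 * math.exp(-t_ms / TAU1)
    slow = A2 * math.exp(-t_ms / TAU2)
    return fast + slow


if __name__ == "__main__":
    for t_ms in (0.0, 3.6, 18.0, 43.47, 200.0):
        print(f"t = {t_ms:7.2f} ms  excess Rg = {rg_excess(t_ms):6.2f} Angstrom")
```

After about five fast time constants (18 ms) the first term has decayed to under 1% of a1, so the remaining compaction is governed almost entirely by the slow stage.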

Parallel Pathways and Kinetic Intermediates:

By analyzing the 80 folding trajectories using E (see Methods) as the progress variable for the folding reaction (Fig. S2) we find that Citrine folds along four pathways. For reasons explained below, we classify the structures populated in three of the pathways as kinetic intermediates, whereas the intermediate in the fourth has most of the hallmarks of IEQL. Representative trajectories, one from each pathway, displaying energy as a function of t, show that in each pathway folding occurs in stages (Fig. 4B). The nature of structures and their lifetimes vary greatly depending on the pathway. In the dominant pathway KIN1, through which ≈50% of the flux to the native state is channeled, Citrine folds in nearly a two-state manner (Fig. 4B). In this pathway, β-strands at the N-terminus, the core of the protein, and the C-terminal strands form rapidly in a single step. The folding units in the long lived meta-stable states collide and consolidate (see Movie S1 in the SI Text) as envisioned in the diffusion-collision model (45).
In the second kinetic pathway, KIN2, representing ≈16% of the trajectories, folding occurs through an intermediate in which tertiary contacts between strands β3 and β11 and the interface between β1 and β6 are not formed (Fig. 4C). The structural properties of the intermediate in KIN2, such as the distributions of Rg and χ, differ from IEQL, and hence we conclude that it is observed only during the process of folding. Similarly, in the third kinetic pathway, KIN3, through which about 10% of the flux to the native state flows, the intermediates shown in Fig. 4D are observed. The intermediate shown in Fig. 4D has a kinked helix interacting with the N-terminal β-sheets. The splay diagram shows further that long-range interactions involving β3 and β11 as well as β1 and β6 are disrupted.

Equilibrium Folding Intermediate:

In about 15% of the trajectories an intermediate whose structural characteristics almost coincide with IEQL is populated. The folding trajectory (Fig. 4B) and the [χ,Rg] plot in Fig. 4D show that the intermediate has a long lifetime. There are four lines of evidence which show that the kinetic and equilibrium intermediates are structurally similar. First, the value obtained using the conformations sampled with folded N-terminal strands (the C-terminal strands are fluid-like) at equilibrium is ∼14 Å whereas that obtained from kinetic trajectories is ∼13 Å. Second, the typical conformations sampled at equilibrium and during the folding process are similar (compare the structures in Fig. 3A). In both structures the strands β1 through β6 are formed whereas the C-terminal region is more flexible. Third, the values of χ for the conformations corresponding to the plateau in E as a function of t (purple line in Fig. 4B) are in the range 0.7 < χ ≤ 0.895, which coincides with the estimates obtained using equilibrium trajectories (see also the shaded green region in Fig. 2B). Fourth, we calculated the distribution, P(ΔR), of the relative accessible surface area, ΔR, which is defined in terms of AU, AI and AN, the solvent accessible surface areas (SASA) of the unfolded, IEQL, and native states, respectively, using the conformations sampled during the kinetic simulations (plateau region in the purple curve in Fig. 4B). From P(ΔR) in Fig. 3A we find a mean value of ΔR that compares well with the experimental value estimated for the equilibrium intermediate in Citrine folding (36). Thus, we surmise that the structures sampled along the EQL pathway (Fig. 4B) are similar to those observed under equilibrium conditions.

Relation Between Collapse, Secondary Structure Formation and Folding:

It has been shown that for efficient folding, collapse and folding occur nearly simultaneously (52), whereas for larger, complex proteins collapse precedes folding. In Figs. 4C and 4D we show the conformations sampled in all the pathways. Each conformation is represented by χ and Rg. We conclude from the results in Figs. 4C and 4D that in all four pathways Citrine undergoes compaction (reduction in Rg) before folding (decrease in χ). These findings show that the search for the NBA occurs among the ensemble of minimum energy compact structures (50). To explore the link between secondary structure formation and collapse we plot the fraction, fss, of secondary structure acquired as the polypeptide chain collapses. Fig. S3 shows that in all the pathways the value of fss is relatively small even after substantial collapse, which also implies that collapse generates a fluid-like globule. Consolidation of structure occurs only after reduction in chain dimension.

Topological Traps:

In 6 out of the 80 folding trajectories topological traps (44) give rise to Citrine misfolding. In five of the trajectories, Citrine is kinetically trapped in a native-like structure in which the loop connecting β9 and β10 (Fig. 5A) is in an incorrect position, thus disrupting the interactions between β-strands 4 and 9. In the other misfolded structure (35), the helix is outside the barrel and hence does not pack in the center as observed in the native structure.
Fig. 5.
Topological traps and model for GFP folding. (A) The wild type GFP structure in blue is superimposed onto the misfolded structure in red. Misfolding of the loop connecting β9 and β10 causes a topological trap. The arrows show the correct (folded) and incorrect (misfolded) conformation of the loop. (B) In the second misfolded structure the barrel involving the 11 β-strands forms with no contact between the central α-helix and the rest of the structure. (C) Folding landscape and network of connected states based on simulations. The flux through the four channels and the routes from the UBA to the NBA are indicated by arrows. Because topological traps (structures in (A) and (B)) are dead ends in the folding process they do not reach the NBA on the time scale of our simulations. Thus, the total flux to the NBA from UBA is less than 100%. Representative structures sampled along the four pathways as well as an unfolded conformation are shown.

Discussion

Intermediates in GFP Folding:

Compelling evidence for postulating the presence of equilibrium intermediates is the non-coincidence of free energies extracted using fluorescence and NMR measurements even though visually the decrease in fluorescence as a function of GdmCl suggests that a two-state description is sufficient (33). More recently, single molecule fluorescence experiments have presented clear evidence for three state equilibrium folding of Citrine with the chromophore (36). The presence of high energy equilibrium intermediates in R96A is only evident in free energy profiles computed as a function of energy (Fig. 1D), radius of gyration (Fig. 2B), and the fraction of native contacts (Fig. S1B). The structural characteristics of IEQL support the finding based on equilibrium H/D exchange experiments showing order in the N-terminus and flexibility in the C-terminal region spanning β7 to β10 (33). It is worth pointing out that one of the intermediates populated in single molecule pulling experiments corresponds to unfolding in the C-terminus domain with β1 through β6 remaining intact (53). Thus, both experiments and simulations point to the presence of IEQL. The observation that IEQL in our simulations is a high energy intermediate occurring with negligible probability whereas the ones characterized in experiments are stable suggests that the intermediates in GFP folding are stabilized by the chromophore.

Comparing Simulation and Experimental Folding Pathways:

We have found evidence for kinetic intermediates that direct folding to the native state. The range of kinetic intermediates observed in our simulations has previously been identified in single molecule pulling experiments and simulations (39, 40, 42, 53). When GFP is stretched by applying mechanical force, f, between residues 3 and 212 (Fig. 1), GFP unfolds along two routes. In the first one, unfolding is an all-or-none process whereas in the second pathway unfolding occurs by populating an intermediate (53). The increase in the contour length upon unfolding GFP from the folded to the intermediate state is consistent with formation of a structure in which strands (1–6) are intact and the rest are unfolded. Our simulations and previous ensemble experiments (33, 37, 38) have identified such an intermediate as occurring both in equilibrium and during the folding process.
When mechanical force is applied (53) to the residues 3 and 198, GFP unfolds by populating a different intermediate in which the 3 N-terminus β-strands (β1, β2, β3) have ruptured away from the rest of the barrel. In addition, when f is applied to the ends of the molecule the dominant unfolding pathway also occurs through such an intermediate (40, 42). We showed (42) that upon initiating folding by reducing the mechanical force from a high value to f = 0, a long-lived metastable intermediate similar to that observed in KIN2 and KIN3 is populated before the barrel structure forms. We find here that in the process of refolding, upon temperature quench, the intermediate with loss of interactions between strands (1–3) and rest of the barrel is populated in pathways KIN2 and KIN3.
Single molecule fluorescence experiments have also shown that Citrine unfolds by parallel pathways. In one of the pathways, the folded state is reached directly, whereas an intermediate with a low FRET is populated in the other (36). It is likely that these low FRET states are an ensemble of conformations containing a mixture of equilibrium and kinetic intermediates. Taken together, it is clear that folding of GFP must occur through a number of distinct intermediates, which become visible only by using different techniques. By combining all the results from simulations and experiments, we propose a model (Fig. 5C) for GFP folding that involves a complex network of connected states through which the flux to the native state flows. The resulting multiple pathways include equilibrium and kinetic intermediates as well as misfolded structures. Detailed comparison between our simulations and experiments shows that a number of different experimental probes are needed to quantitatively map the folding landscape of GFP, and presumably other proteins with complex topology.

Collapse and Folding:

Since it was first demonstrated theoretically that folding and collapse transitions are intimately linked for proteins that reach the native state efficiently (54, 55) there has been considerable interest in validating this concept. For wild type GFP it is unmistakable that folding is preceded by populating compact intermediate species. Both SAXS experiments (38) as well as our simulations show that a compact state, with Rg intermediate between the values in the folded and unfolded states, is present and is likely to be on-pathway to the NBA. The kinetic folding trajectories clearly show (Fig. 4 C and D) that GFP first collapses before the formation of secondary and tertiary interactions. Our finding is strikingly similar to the order in which structure is acquired in monellin with β-sheet topology (51) as well as in GFP upon force quench (42).

Methods

SOP-Side Chain (SOP-SC) Model:

In the SOP-SC model, which uses residue-dependent interactions between SCs (56), each residue is represented by two interaction centers, one for the backbone atoms and the other for the side chain atoms. The SOP-SC model is constructed using the crystal structure of Citrine, an improved yellow version of GFP (57), with the Protein Data Bank (PDB) ID: 1HUY. Citrine has 5 mutations compared to the wild type GFP (S65G, V68L, Q69M, S72A, T203Y). The residues labeled 0 and 1A in 1HUY are deleted. In Citrine the chromophore, labeled residue 66 in 1HUY, involves residues GLY65, TYR66, and GLY67. In our simulations the chromophore is disabled by eliminating the chemical reaction involving the three chromophore residues. The functional form of the SOP-SC model and the parameters of the CG force field are given in Table S1 in the SI Text.

Molecular Transfer Model (MTM):

The energy of transferring a specific protein conformation from water to the denaturant solution at concentration [C] is written as

$$\Delta G_{tr}(\{r_i\},[C]) = \sum_{i} \delta g_{tr,i}([C])\,\frac{\alpha_i}{\alpha_{Gly-i-Gly}},$$

where the summation includes both backbone and side chain beads, δgtr,i([C]) is the experimentally measured transfer free energy of bead i, αi is the solvent accessible surface area (SASA) of bead i, and αGly-i-Gly is the SASA of the same bead in the tripeptide Gly-i-Gly. The radii of the amino acid backbone and side chain beads are given in Table S2. The transfer free energies δgtr,i([C]) for the backbone and side chains, taken from experiments (20, 22, 58), are listed in table S3 in ref. 21. The values of αGly-i-Gly are listed in table S4 in ref. 21. The thermodynamic properties of a protein at [C] ≠ 0 are obtained using the procedure described earlier (20, 22).
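As an illustration of the MTM transfer term, the following minimal Python sketch evaluates the transfer energy of a single conformation as a sum over beads of δgtr,i([C]) weighted by the ratio αi/αGly-i-Gly. The function name, argument names, and the numbers in the usage note are hypothetical placeholders chosen for illustration; they are not the tabulated values from refs. 20 to 22, and the SASA calculation itself is assumed to be done elsewhere.

```python
def transfer_free_energy(delta_g_tr, sasa, sasa_gly_x_gly):
    """Transfer energy of one conformation at a fixed denaturant concentration.

    delta_g_tr     -- per-bead transfer free energies at concentration [C]
    sasa           -- per-bead solvent accessible surface areas in this conformation
    sasa_gly_x_gly -- per-bead reference SASA in the Gly-i-Gly tripeptide

    All three sequences must list the beads (backbone and side chain)
    in the same order. Units are whatever the inputs use (assumption).
    """
    if not (len(delta_g_tr) == len(sasa) == len(sasa_gly_x_gly)):
        raise ValueError("all inputs must have one entry per bead")
    total = 0.0
    for dg, area, ref_area in zip(delta_g_tr, sasa, sasa_gly_x_gly):
        if ref_area <= 0.0:
            raise ValueError("reference SASA must be positive")
        # Each bead contributes in proportion to its relative exposure.
        total += dg * area / ref_area
    return total
```

For example, with two beads and made-up inputs, `transfer_free_energy([-0.1, 0.2], [50.0, 30.0], [100.0, 60.0])` sums the contributions of two half-exposed beads. A fully buried bead (SASA of zero) contributes nothing, which is how the model couples the denaturant effect to the conformation.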

Data Analysis:

We use the structural overlap function (59)

$$\chi = 1 - \frac{2}{M^2 - 5M + 6}\sum_{i=1}^{M-3}\sum_{j=i+3}^{M}\Theta\left(\delta - |r_{ij} - r_{ij}^{0}|\right)$$

to distinguish between the various populated states. Here, M = 2N = 460 is the number of interaction centers in the SOP-SC representation of GFP, rij is the distance between the beads i and j, with r0ij being the corresponding distance in the folded state, Θ is the Heaviside step function, and δ = 2 Å. Conformations with 0 < χ ≤ 0.7 are folded, an intermediate state has 0.7 < χ ≤ 0.895, and conformations with χ > 0.895 are unfolded (Fig. S1 in the SI Text).
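The state assignment can be sketched in a few lines of Python. The sketch assumes the form of the overlap function used in ref. 59: χ is one minus the fraction of bead pairs separated by at least three positions along the chain whose distance lies within δ of its value in the folded state. The pair normalization, the treatment of Θ(0) as 1, the default δ = 2.0 (taken to be in Å), and all function names are assumptions made for illustration; only the thresholds 0.7 and 0.895 come from the text.

```python
import math

def structural_overlap(coords, native_coords, delta=2.0):
    """Structural overlap chi between a conformation and the folded state.

    coords, native_coords -- sequences of (x, y, z) bead positions, same order
    delta                 -- distance tolerance (assumed to be in angstroms)
    """
    m = len(coords)
    if m != len(native_coords) or m < 4:
        raise ValueError("need two structures with the same number (>= 4) of beads")
    n_pairs = (m * m - 5 * m + 6) // 2  # number of pairs with j >= i + 3
    within = 0
    for i in range(m - 3):
        for j in range(i + 3, m):
            r = math.dist(coords[i], coords[j])
            r0 = math.dist(native_coords[i], native_coords[j])
            if abs(r - r0) <= delta:  # Heaviside step, with Theta(0) taken as 1
                within += 1
    return 1.0 - within / n_pairs

def classify_state(chi):
    """Assign a state label using the chi thresholds quoted in the text."""
    if chi <= 0.7:
        return "folded"  # includes chi = 0, the folded structure itself
    if chi <= 0.895:
        return "intermediate"
    return "unfolded"
```

Applied to every frame of a trajectory, `classify_state(structural_overlap(frame, native))` gives a time series of state labels. The double loop costs O(M²) per frame, which is acceptable for M = 460 but would usually be vectorized for long trajectories.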

ACKNOWLEDGMENTS.

We are pleased to acknowledge useful discussions with Sophie Jackson. This work was supported by the National Science Foundation through grant CHE 09-14033. ZL acknowledges financial support from the National Natural Science Foundation of China under grant no. 11104015.

Supporting Information

Supporting Information (PDF)
SM01.mpg

References

1
D Thirumalai, C Hyeon, RNA and protein folding: Common themes and variations. Biochemistry 44, 4957–4970 (2005).
2
J Onuchic, Z LutheySchulten, P Wolynes, Theory of protein folding: The energy landscape perspective. Annu Rev Phys Chem 48, 545–600 (1997).
3
E Shakhnovich, Protein folding thermodynamics and dynamics: Where physics, chemistry, and biology meet. Chem Rev 106, 1559–1588 (2006).
4
KA Dill, SB Ozkan, MS Shell, TR Weikl, The protein folding problem. Annu Rev Biophys 37, 289–316 (2008).
5
V Munoz, W Eaton, A simple model for calculating the kinetics of protein folding from three-dimensional structures. Proc Natl Acad Sci USA 96, 11311–11316 (1999).
6
J Kubelka, ER Henry, T Cellmer, J Hofrichter, WA Eaton, Chemical, physical, and theoretical kinetics of an ultrafast folding protein. Proc Natl Acad Sci USA 105, 18655–18662 (2008).
7
D Thirumalai, EP O’Brien, G Morrison, C Hyeon, Theoretical Perspectives on Protein Folding. Annu Rev Biophys 39, 159–183 (2010).
8
A Fersht, V Daggett, Protein folding and unfolding at atomic resolution. Cell 108, 573–582 (2002).
9
DE Shaw, et al., Atomic-Level Characterization of the Structural Dynamics of Proteins. Science 330, 341–346 (2010).
10
VA Voelz, VR Singh, WJ Wedemeyer, LJ Lapidus, VS Pande, Unfolded-state dynamics and structure of protein L characterized by simulation and experiment. J Am Chem Soc 132, 4702–4709 (2010).
11
C Hyeon, D Thirumalai, Capturing the essence of folding and functions of biomolecules using coarse-grained models. Nat Commun 2, 487 (2011).
12
PC Whitford, et al., An all-atom structure-based potential for proteins: Bridging minimal models with all-atom empirical forcefields. Proteins 75, 430–441 (2009).
13
Z Zhang, HS Chan, Competition between native topology and nonnative interactions in simple and complex folding kinetics of natural and designed proteins. Proc Natl Acad Sci USA 107, 2920–2925 (2010).
14
B Schuler, WA Eaton, Protein folding studied by single-molecule FRET. Curr Opin Struct Biol 18, 16–26 (2008).
15
AA Nickson, J Clarke, What lessons can be learned from studying the folding of homologous proteins? Methods 52, 38–50 (2010).
16
A Borgia, PM Williams, J Clarke, Single-molecule studies of protein folding. Annu Rev Biochem 77, 101–125 (2008).
17
AI Bartlett, SE Radford, An expanding arsenal of experimental methods yields an explosion of insights into protein folding mechanisms. Nat Struct Mol Biol 16, 582–588 (2009).
18
EA Shank, C Cecconi, JW Dill, S Marqusee, C Bustamante, The folding cooperativity of a protein is controlled by its chain topology. Nature 465, 637–U134 (2010).
19
JCM Gebhardt, T Bornschloegl, M Rief, Full distance-resolved folding energy landscape of one single protein molecule. Proc Natl Acad Sci USA 107, 2013–2018 (2010).
20
E O’Brien, G Ziv, G Haran, B Brooks, D Thirumalai, Effects of denaturants and osmolytes on proteins are accurately predicted by the molecular transfer model. Proc Natl Acad Sci USA 105, 13403–13408 (2008).
21
Z Liu, G Reddy, EP O’Brien, D Thirumalai, Collapse kinetics and chevron plots from simulations of denaturant-dependent folding of globular proteins. Proc Natl Acad Sci USA 108, 7787–7792 (2011).
22
E O’Brien, B Brooks, D Thirumalai, Molecular origin of constant m-values, denatured state collapse, and residue-dependent transition midpoints in globular proteins. Biochemistry 48, 3743–3754 (2009).
23
M Oliveberg, PG Wolynes, The experimental survey of protein-folding energy landscapes. Q Rev Biophys 38, 245–288 (2005).
24
D Klimov, D Thirumalai, Native topology determines force-induced unfolding pathways in globular proteins. Proc Natl Acad Sci USA 97, 7254–7259 (2000).
25
A Fernandez-Escamilla, et al., Solvation in protein folding analysis: Combination of theoretical and experimental approaches. Proc Natl Acad Sci USA 101, 2834–2839 (2004).
26
DK Klimov, D Thirumalai, Symmetric connectivity of secondary structure elements enhances the diversity of folding pathways. J Mol Biol 353, 1171–1186 (2005).
27
HS Chung, JM Louis, WA Eaton, Experimental determination of upper bound for transition path times in protein folding from single-molecule photon-by-photon trajectories. Proc Natl Acad Sci USA 106, 11837–11844 (2009).
28
D Klimov, D Thirumalai, Stiffness of the distal loop restricts the structural heterogeneity of the transition state ensemble in SH3 domains. J Mol Biol 317, 721–737 (2002).
29
F Ding, W Guo, N Dokholyan, E Shakhnovich, J Shea, Reconstruction of the src-SH3 protein domain transition state ensemble using multiscale molecular dynamics simulations. J Mol Biol 350, 1035–1050 (2005).
30
J Onuchic, N Socci, Z LutheySchulten, P Wolynes, Protein folding funnels: The nature of the transition state ensemble. Folding Des 1, 441–450 (1996).
31
Z Guo, D Thirumalai, The Nucleation-collapse mechanism in protein folding: Evidence for the non-uniqueness of the folding nucleus. Folding Des 2, 277–341 (1997).
32
STD Hsu, G Blaser, SE Jackson, The folding, stability and conformational dynamics of beta-barrel fluorescent proteins. Chem Soc Rev 38, 2951–2965 (2009).
33
J Huang, T Craggs, J Christodoulou, S Jackson, Stable intermediate states and high energy barriers in the unfolding of GFP. J Mol Biol 370, 356–371 (2007).
34
B Andrews, A Schoenfish, M Roy, G Waldo, P Jennings, The rough energy landscape of Superfolder GFP is linked to chromophore. J Mol Biol 373, 476–490 (2007).
35
B Andrews, S Gosavi, J Finke, J Onuchic, P Jennings, The dual-basin landscape in GFP folding. Proc Natl Acad Sci USA 105, 12283–12288 (2008).
36
A Orte, T Craggs, S White, S Jackson, D Klenerman, Evidence of an intermediate and parallel pathways in protein unfolding from single-molecule fluorescence. J Am Chem Soc 130, 7898–7907 (2008).
37
S Enoki, K Saeki, K Maki, K Kuwajima, Acid denaturation and refolding of Green Fluorescent Protein. Biochemistry 43, 14238–14248 (2004).
38
S Enoki, et al., The equilibrium unfolding intermediate observed at pH 4 and its relationship with the kinetic folding intermediates in green fluorescent protein. J Mol Biol 361, 969–982 (2006).
39
H Dietz, M Rief, Exploring the energy landscape of GFP by single-molecule mechanical experiments. Proc Natl Acad Sci USA 101, 16192–16197 (2004).
40
M Mickler, et al., Revealing the bifurcation in the unfolding pathways of GFP by using single-molecule experiments and simulations. Proc Natl Acad Sci USA 104, 20268–20273 (2007).
41
R Tsien, The green fluorescent protein. Annu Rev Biochem 67, 509–544 (1998).
42
C Hyeon, R Dima, D Thirumalai, Pathways and kinetic barriers in mechanical unfolding and refolding of RNA and proteins. Structure 14, 1633–1645 (2006).
43
T Veitshans, D Klimov, D Thirumalai, Protein folding kinetics: timescales, pathways and energy landscapes in terms of sequence-dependent properties. Folding Des 2, 1–22 (1996).
44
Z Guo, D Thirumalai, Kinetics of protein-folding—nucleation mechanism, time scales, and pathways. Biopolymers 36, 83–102 (1995).
45
M Karplus, D Weaver, Protein-folding dynamics—the diffusion-collision model and experimental-data. Protein Sci 3, 650–668 (1994).
46
Y Okamoto, U Hansmann, Thermodynamics of helix-coil transitions studied by multicanonical algorithms. J Phys Chem 99, 11276–11287 (1995).
47
U Hansmann, Y Okamoto, F Eisenmenger, Molecular dynamics, Langevin and hybrid Monte Carlo simulations in a multicanonical ensemble. Chem Phys Lett 259, 321–330 (1996).
48
A Nagy, A Malnasi-Csizmadia, B Somogyi, D Lorinczy, Thermal stability of chemically denatured green fluorescent protein (GFP)—A preliminary study. Thermochim Acta 410, 161–163 (2004).
49
TN Melnik, TV Povarnitsyna, AS Glukhov, VN Uversky, BS Melnik, Sequential melting of two hydrophobic clusters within the green fluorescent protein GFP-cycle3. Biochemistry 50, 7735–7744 (2011).
50
C Camacho, D Thirumalai, Minimum energy compact structures of random sequences of heteropolymers. Phys Rev Lett 71, 2505–2508 (1993).
51
T Kimura, et al., Specific collapse followed by slow hydrogen-bond formation of beta-sheet in the folding of single-chain monellin. Proc Natl Acad Sci USA 102, 2748–2753 (2005).
52
D Klimov, D Thirumalai, Factors governing the foldability of proteins. Proteins 26, 411–441 (1996).
53
M Bertz, A Kunfermann, M Rief, Navigating the folding energy landscape of green fluorescent protein. Angew Chem Int 47, 8192–8195 (2008).
54
ZY Guo, D Thirumalai, JD Honeycutt, Folding kinetics of proteins—A model study. J Chem Phys 97, 525–535 (1992).
55
CJ Camacho, D Thirumalai, Kinetics and thermodynamics of folding in model proteins. Proc Natl Acad Sci USA 90, 6369–6372 (1993).
56
M Betancourt, D Thirumalai, Pair potentials for protein folding: choice of reference states and sensitivity of predicted native states to variations in the interaction schemes. Protein Sci 8, 361–369 (1999).
57
O Griesbeck, G Baird, R Campbell, D Zacharias, R Tsien, Reducing the environmental sensitivity of yellow fluorescent protein. mechanism and applications. J Biol Chem 276, 29188–29194 (2001).
58
M Auton, D Bolen, Additive transfer free energies of the peptide backbone unit that are independent of the model compound and the choice of concentration scale. Biochemistry 43, 1329–1342 (2004).
59
Z Guo, D Thirumalai, Kinetics and thermodynamics of folding of a de Novo designed four-helix bundle protein. J Mol Biol 263, 323–343 (1996).

Information & Authors

Information

Published in

Proceedings of the National Academy of Sciences
Vol. 109 | No. 44
October 30, 2012
PubMed: 22778437

Submission history

Published online: July 9, 2012
Published in issue: October 30, 2012

Keywords

  1. complex folding landscape of GFP
  2. equilibrium and kinetic pathways
  3. multiple folding routes
  4. self-organized polymer model
  5. multicanonical simulation

Notes

This article is a PNAS Direct Submission.

Authors

Affiliations

Govardhan Reddy
Biophysics Program, Institute for Physical Science and Technology and Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742
Zhenxing Liu
Department of Physics, Beijing Normal University, Beijing 100875, China
D. Thirumalai (1)
Biophysics Program, Institute for Physical Science and Technology and Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742

Notes

1
To whom correspondence should be addressed. E-mail: [email protected].
Author contributions: G.R. and D.T. designed research; G.R., Z.L., and D.T. performed research; G.R. and D.T. contributed new reagents/analytic tools; G.R. and D.T. analyzed data; and G.R. and D.T. wrote the paper.

Competing Interests

The authors declare no conflict of interest.
