An experimental survey of the transition between two-state and downhill protein folding scenarios
- Feng Liu‡,
- Deguo Du§,
- Amelia A. Fuller§,
- Jennifer E. Davoren¶,
- Peter Wipf¶,
- Jeffery W. Kelly§, and
- Martin Gruebele‡,‖,‡‡
- ‡Center for Biophysics and Computational Biology and
- ‖Departments of Chemistry and Physics, University of Illinois at Urbana–Champaign, Urbana, IL 61801;
- §Department of Chemistry and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road BCC265, La Jolla, CA 92037; and
- ¶Center for Chemical Methodologies and Library Development, University of Pittsburgh, Pittsburgh, PA 15260
Communicated by Peter G. Wolynes, University of California at San Diego, La Jolla, CA, December 18, 2007 (received for review November 25, 2007)
Abstract
A kinetic and thermodynamic survey of 35 WW domain sequences is used in combination with a model to discern the energetic requirements for the transition from two-state folding to downhill folding. The sequences used exhibit a 600-fold range of folding rates at the temperature of maximum folding rate. Very stable proteins can achieve complete downhill folding when the temperature is lowered sufficiently below the melting temperature, and then at even lower temperatures they become two-state folders again because of cold denaturation. Less stable proteins never achieve a sufficient bias to fold downhill because of the onset of cold denaturation. The model, considering both heat and cold denaturation, reveals that to achieve incipient downhill folding (barrier <3 RT) or downhill folding (no barrier), the WW domain average melting temperatures have to be ≥50°C for incipient downhill folding and ≥90°C for downhill folding.
Energy landscape theory predicts that the entropic and enthalpic contributions to the free energy of a protein may be able to compensate for one another to the point where no significant (>3 RT) barrier appears along the folding reaction coordinate (1). Such folding is now referred to as “type 0” or “downhill” folding (1, 2). The possibility of downhill folding has been supported by a number of experiments (3–7). Kinetic measurements have focused on the transition from simple single exponential (two-state) to nonexponential (low barrier) and back toward simpler (pure downhill) kinetics as the thermodynamic bias toward the native state is increased (3, 6, 8, 9). Thermodynamic measurements with probe-dependent baselines and transition temperatures suggest that downhill folding may be possible even at the melting temperature of a protein (10–12). Recently, two engineered proteins with identical melting temperatures were compared by both kinetic and thermodynamic criteria, showing that one can be classified as a two-state folder, whereas the other can be classified as a downhill folder (13). Such results have been debated extensively in the literature (14–17).
It has been suggested that the kinetics and thermodynamics of downhill folders must be very sensitive to sequence and environment because of the low barriers involved (18). Indeed, the fast folder lambda repressor has been shown to fold by either two-state, framework intermediate, or downhill mechanisms, depending on solvent conditions and sequence (6, 9, 13, 19, 20). Models for protein BBL, another fast folding protein, also indicate that it folds either in a two-state or downhill manner, depending on the exact sequence and solvent conditions (21, 22).
The diversity of observations suggests that criteria for downhill folding can be developed only by examining a large number of fast-folding proteins. Here, we take a survey approach to the experimental study of downhill folding. We examine a series of 35 engineered WW domains with variation in loops 1 and 2, where sequence changes have the largest effect on folding kinetics without disrupting the fold topology (23–25). These proteins exhibit a wide range of melting temperatures, T m (20–85°C) and span a 600-fold range of rate coefficients at the temperature of the maximum folding rate. Four of the fastest folders we studied switch from single exponential kinetics to having an additional fast molecular rate coefficient, k m, below T m, indicating incipient downhill folding (6).
We deduce a rather sharp transition from two-state folding to incipient downhill folding as a function of melting temperature and group proteins into three temperature zones proposed in our earlier work: apparent two-state folders, incipient downhill folders (<3-RT 0 barrier), and downhill folders (9, 13). These zones for T m can be understood by using a quantitative model that accounts for both heat denaturation (at T m) and cold denaturation (at T cd). As illustrated in Fig. 1, a very stable protein (Fig. 1 Left) might be a two-state folder near its T cd and T m but have a sufficiently strong native bias at temperatures between T cd and T m to become a downhill folder. A less stable protein (Fig. 1 Right) might barely achieve a sufficient native bias to become an incipient downhill folder or it might even remain a two-state folder over the entire temperature range between T cd and T m. Two-state/downhill/two-state transitions for stable proteins owe their existence to the temperature dependence of solvent-averaged interaction free energies, as was first discussed for phosphoglycerate kinase by Sabelko et al. (3). Our multiprotein survey enables us to predict at what characteristic melting temperatures to expect the transitions from two-state folding to incipient downhill folding, and finally to complete downhill folding. We hope that this work will stimulate tests of our prediction for two-state, incipient downhill, and downhill folding zones to see whether they apply quantitatively beyond the WW domain fold shown in Fig. 2.
Fig. 1. Free energy difference between the native and the denatured state as a function of temperature. The temperature of maximal stability lies between the cold and heat denaturation temperatures. A very stable protein may reach downhill folding there, whereas a less stable protein may still fold over an activation barrier. Near the two melting points T cd and T m, both proteins fold over an activation barrier, as illustrated by the free energy surfaces G(x) as a function of reaction coordinate x shown at the bottom.
Results
Proteins.
To obtain meaningful statistics, we used engineered WW domains exhibiting a wide range of melting temperatures. Because loop 1 formation is rate-limiting for WW domain folding, we used WW domain sequences harboring both natural and unnatural sequences in lieu of the WT loop 1 sequence. In addition, we include a number of previously studied WW domains in our analysis (25–27). Table 1 shows the 35 Pin1-derived WW domain sequences we use in our survey. Further details are discussed in Materials and Methods.
Table 1. Thirty-five proteins derived from the hPin1 WW domain by substitutions in loop 1 (residues 16–21), residue T29 in loop 2, and residue W34 in strand 3
Unfolding Thermodynamics Are Two-State at Tm.
Thermal titration curves (see Materials and Methods) were fitted to a two-state folding model as described in detail in ref. 24. The free energy difference between the native state and the denatured state, ΔG(T), was expanded about the melting temperature T m as a quadratic function of temperature (Eq. 1). The equilibrium constant is K = exp[−ΔG(T)/RT]. K is equal to 1 at T m but also at the cold denaturation temperature T cd (Eq. 2) in this quadratic model. T cd typically lies well below 0°C (Fig. 1 shows both T m and T cd). The CD and fluorescence thermal denaturation curves show that there are no probe-dependent differences near T m. Even the fastest folders in this study undergo the Bryngelson et al. (1) type 0 to type 1 (downhill to two-state) transition when the native state is destabilized by raising the temperature toward the heat denaturation transition.
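For orientation, a quadratic expansion of this kind can be written out explicitly. The form below is a sketch, assuming a temperature-independent heat capacity difference ΔC p and the convention ΔG = G(native) − G(denatured) implied by K = exp[−ΔG(T)/RT]; the parametrization used in the original Eqs. 1 and 2 may differ.

```latex
\begin{align}
\Delta G(T) &\approx -\Delta S(T_m)\,(T - T_m) - \frac{\Delta C_p}{2\,T_m}\,(T - T_m)^2 ,\\
T_{cd} &\approx T_m - \frac{2\,T_m\,\Delta S(T_m)}{\Delta C_p} .
\end{align}
```

The first line is the second-order Taylor expansion of the constant-ΔC p Gibbs–Helmholtz expression about T m, and the second line is its nonzero root. With ΔS(T m) < 0 and ΔC p < 0 for folding, this root lies below T m, and the maximum native bias occurs midway, at (T cd + T m)/2.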
T-Jump Kinetics.
Protein folding kinetics were measured with our home-built laser T-jump instrument in 10 mM phosphate buffer. The observed relaxation was fitted by least squares to single exponential (A m = 0) or stretched plus exponential models (Eq. 3). k a and k m are the observed activated and molecular rate coefficients, and β is the stretching factor for the molecular phase (1 for normal diffusion, <1 for anomalous diffusion). The slower fitted activated rate coefficient was analyzed by a two-state model k a = k f + k u, where k f = k a K/(1 + K) and k u denote the activated folding and activated unfolding rate coefficients. The Arrhenius plots of the activated folding rate could all be fitted to quadratic polynomials, yielding a maximum folding rate coefficient k f(T max) = τf⁻¹ at the temperature T max, which lies between T cd and T m for almost all proteins (Table 1). As discussed previously, downhill folding dynamics characterized by k m can be stretched or simple exponential (9, 17, 26, 27). A fit within the experimental signal-to-noise ratio required τm = 1.4 μs and the stretching factor β = 0.4 in Fig. 3. Reasonable fits could be obtained by setting β = 1, thus characterizing the molecular phase by an average time constant. The resulting τm = 1/k m are listed in Table 2 for the four molecules with A m > 10%. Note that under incipient downhill conditions, where the fast molecular phase k m takes over from the activated phase k a, K ≫ 1 and the downhill folding rate equals k m according to ref. 6.
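The fit model and the two-state decomposition described above can be sketched in a few lines of Python. The functional form of `relaxation` (an activated exponential plus a stretched molecular phase) is an assumption based on the verbal description of the fit; the names `relaxation` and `two_state_rates` and the numbers in the example are hypothetical and are not taken from the tables.

```python
import math

def relaxation(t_us, a_a, k_a, a_m=0.0, k_m=1.0, beta=1.0):
    """Assumed 'stretched plus exponential' relaxation signal.

    a_a * exp(-k_a t) is the activated phase and
    a_m * exp(-(k_m t)**beta) is the molecular phase.
    a_m = 0 reduces to the single exponential model.
    Times are in microseconds, rates in 1/microsecond.
    """
    activated = a_a * math.exp(-k_a * t_us)
    molecular = a_m * math.exp(-((k_m * t_us) ** beta))
    return activated + molecular

def two_state_rates(k_a, delta_g_over_rt):
    """Split the observed activated rate k_a = k_f + k_u.

    delta_g_over_rt is dG(T)/RT, so K = exp(-dG/RT) and
    k_f = k_a * K / (1 + K), as in the two-state analysis above.
    """
    big_k = math.exp(-delta_g_over_rt)
    k_f = k_a * big_k / (1.0 + big_k)
    k_u = k_a - k_f
    return k_f, k_u

# At the melting temperature dG = 0, so K = 1 and k_f = k_u = k_a / 2.
k_f, k_u = two_state_rates(k_a=0.1, delta_g_over_rt=0.0)
```

Far below T m, where K ≫ 1, `two_state_rates` returns k f ≈ k a, which is the limit invoked in the text for the incipient downhill regime.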
Fig. 3. Protein relaxation kinetics of variant 20 recorded with a time resolution of 280 ns. (A) Relaxation is fitted very well by a single exponential decay when the temperature is jumped to T m. (B) The stretched plus exponential fit becomes better than a single exponential fit when the temperature is jumped to 19°C below T m.
Table 2. Incipient downhill folders: Temperature of fitted kinetics in Fig. 3, molecular time scale, and activated folding time
Correlation Between Maximal Folding Rate and Tm.
Mutants with higher stability have higher maximum folding rates on average (Fig. 4). The correlation between ln(τf(T max)) and T m is sigmoidal, as indicated by the dotted envelopes. A transition to faster folding occurs at T m* ≈ 50°C. Near T m*, as indicated by the vertical/red dotted line in Fig. 4, the folding time τf(T max) has a very broad distribution, ranging from 10 μs (variant 20) to 758 μs (variant 15). The distribution of τf(T max) is much narrower above this temperature. Folding times above T m* are on average more than an order of magnitude faster than for proteins below that transition temperature. The minimum activated folding time of all of the mutants is 10 μs (variant 20 with a T m of 64°C). The folding time of variant 21 with the highest T m (85°C) is 14 μs at the lowest temperature where we could measure it.
Fig. 4. Correlation between the minimum activated folding time (τf at the temperature of maximum folding rate) and the melting temperature T m. All of the data are shown in Table 1. Blue circles are variants 1–28, and red circles are variants 29–35 measured here, which include loops containing non-natural amino acids. The triangles correspond to double exponential fits to Eq. 3 for the four proteins of Table 2. The upper triangle of a pair is the activated folding time, and the lower triangle of a pair is the molecular (downhill) time scale.
Four Among the Fastest Folders Make a Transition from Exponential to Nonexponential Kinetics at High Native Bias.
At temperatures below T m, but well above the cold denaturation temperature T cd, we identified a transition from exponential to nonexponential kinetics for four mutants. All of them have high melting temperatures. At their T m, these proteins fold with single exponential relaxation kinetics. Nonexponential relaxation begins when the temperature is lowered at least 10–20°C below T m. At even lower temperature, the cold denaturation process illustrated in Fig. 1 presumably would lead back to two-state folding, but we cannot measure kinetics at the required subfreezing temperatures.
Fig. 3 shows the onset of a single exponential-to-nonexponential transition: Protein variant 20 has a minimum folding time of 10 μs near its T m. Its relaxation kinetics can be fitted to a single exponential at T m. When the temperature decreases to 19°C below T m, the double exponential fit becomes significantly better than the single exponential fit. For pure downhill folding, the prediction (9) is that the activated and molecular phases merge. We observe only incipient downhill folding (barrier <3 RT), where both phases are still observed, at the temperatures we were able to reach. The fitted molecular rate ranges from (5 μs)⁻¹ to (1 μs)⁻¹, as summarized in Table 2.
The Molecular Phase for Diffusive Relaxation Becomes Faster at Higher Temperature.
The fitted stretching coefficient ranges from 0.4 to 1. If we fix it to 1, the molecular rate increases from k m = (4–5 μs)⁻¹ to (1–2 μs)⁻¹ at higher temperatures (Table 2). The increase is slightly larger than the viscosity scaling expected in the normal Kramers regime (k m ∼ η⁻¹). Solvent viscosity scaling accounts for a decrease in the molecular time constant of only about 40% between 40°C and 70°C. However, the k m values extracted are currently not reliable enough to allow a quantitative comparison with models of viscosity scaling for the prefactor (28).
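The size of the viscosity effect can be checked with a short calculation. The sketch below assumes pure water and uses a common empirical Vogel-type correlation for its viscosity (constants 2.414 × 10⁻⁵ Pa·s, 247.8 K, and 140 K); the buffer and the D2O content of the actual samples are ignored, and the function name `water_viscosity` is hypothetical.

```python
def water_viscosity(t_celsius):
    """Approximate viscosity of liquid water in Pa*s (empirical correlation).

    ASSUMPTION: pure H2O; the experiments used a D2O/H2O phosphate buffer.
    """
    t_kelvin = t_celsius + 273.15
    return 2.414e-5 * 10.0 ** (247.8 / (t_kelvin - 140.0))

eta_40 = water_viscosity(40.0)
eta_70 = water_viscosity(70.0)

# Under Kramers scaling (k_m ~ 1/eta) the molecular time constant
# 1/k_m is proportional to the viscosity, so it shrinks by this ratio.
ratio = eta_70 / eta_40
print(f"eta(70 C)/eta(40 C) = {ratio:.2f}")
print(f"viscosity decrease  = {100.0 * (1.0 - ratio):.0f}%")
```

The ratio comes out near 0.6, consistent with the roughly 40% figure quoted above, whereas the fitted time constants in Table 2 shorten by a larger factor.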
Discussion
Downhill folding has been investigated experimentally by looking at thermodynamic or kinetic anomalies of a few individual proteins. A statistical analysis covering a wide range of stabilities and folding rates reported here using 35 WW domains yields more concrete insight into the requirements for downhill folding. The salient observations that need to be explained are as follows. Why is there a change from only two-state folders being observed at T m < 50°C to incipient downhill folding at T m between 50 and 85°C? Why is this change so sharp? And why has no pure downhill folder been observed below T m = 85°C? We begin the discussion with a model that allows us to define protein stability zones for “two-state,” “incipient downhill,” and “downhill” folding. Finally, we discuss some of the features of the molecular rate coefficients k m and their association with downhill folding in more detail.
Fig. 4 shows the two-state, incipient downhill, and downhill zones according to the rate criterion of Yang and Gruebele (6). Proteins with a single slow phase k a are apparent two-state folders. Proteins showing two phases (triangles in Fig. 4) are incipient downhill folders: The slow phase is the remnant of activated folding, and the fast molecular phase k m results from a sizeable protein population diffusing over a low barrier (<3 RT) (6, 9). Proteins with a single fast phase k m would be pure downhill folders.
It appears from Fig. 4 that such zones can also be defined by a single thermodynamic criterion; namely, T m. The transition at ≈50°C suggests that incipient downhill folding cannot be observed below that temperature. Moreover, the transition to complete downhill folding appears to lie at or above 85°C, because even the highest T m folder shows traces of both activated and molecular phases.
A fitting model that includes both heat denaturation and cold denaturation explains why less stable proteins cannot be made to fold downhill. As illustrated in Fig. 1 and Eq. 1, the native bias is a quadratic function of temperature: Proteins not only heat-denature at T m, but also cold-denature at T cd (Eq. 2). If a protein is very stable, the region between the two denaturation temperatures is wide enough to allow downhill folding near its center. If a protein has a low melting point, then the region between T m and T cd is small, so a strong native bias cannot be achieved, and the protein might even remain a two-state folder everywhere between T m and T cd. This hypothesis was introduced earlier, qualitatively, by Sabelko and coworkers (3); however, the present survey provides sufficient data to test this notion quantitatively. Specifically, we show that the characteristic temperatures T m* ≈ 50°C (two-state to incipient downhill) and T m* ≥ 85°C (incipient downhill to pure downhill) can be predicted from our data.
We approximate the free energy profile as a function of a collective reaction coordinate x (not specified structurally) and temperature T by a sum of two terms (Eq. 4). The first term represents a symmetric double-well at T m (Fig. 1, bottom), when ΔG = 0 (Eq. 5). ΔG† is the barrier between the native and denatured states when ΔG = 0. The second term in Eq. 4 is the free energy difference between the folded and denatured ensembles, assuming the reaction coordinate x is normalized so x = −1 in the denatured state and x = +1 in the native state. Because heat and cold denaturation are both possible, ΔG(T) is a quadratic function of temperature (Eq. 1).
Finally, we need an expression for the activation energy ΔG† in Eq. 5. A plot of the activated folding time k f⁻¹(T m) in Fig. 5 A shows that ΔG† decreases as the melting temperature of the protein increases. We approximate its dependence on T m by a linear function (Eq. 6). The first term is the average barrier of mutants at T 0 = 298 K (for reference), λ† describes the linear dependence on the melting temperature T m, and ΔG†random describes random sequence-dependent fluctuations of the barrier.
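One explicit realization of this model, offered as a sketch rather than as the original equations, uses a quartic double well. The quartic form of G_0(x) and the sign convention for λ† are assumptions; only the two-term structure, the normalization x = ±1, and the linear dependence of the barrier on T m are taken from the description above.

```latex
\begin{align}
G(x,T) &= G_0(x) + \tfrac{1}{2}\,\Delta G(T)\,x ,\\
G_0(x) &= \Delta G^{\dagger}\,\bigl(1 - x^2\bigr)^2 ,\\
\Delta G^{\dagger} &= \Delta G_0^{\dagger} - \lambda^{\dagger}\,(T_m - T_0) + \Delta G^{\dagger}_{\mathrm{random}} ,\\
\text{no barrier when}\quad |\Delta G(T)| &\ge \frac{16}{3\sqrt{3}}\,\Delta G^{\dagger} \approx 3.1\,\Delta G^{\dagger} .
\end{align}
```

With this choice G(+1, T) − G(−1, T) = ΔG(T), and G_0 has a barrier of height ΔG† at x = 0. The last line follows because the barrier on the denatured side disappears once the tilt |ΔG|/2 exceeds the steepest slope of G_0, which is 8ΔG†/(3√3) at |x| = 1/√3.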
Fig. 5. Two-state to downhill folding transition model. (A) The activation barrier at T m of the 35 variants of WW protein, ΔG †, is approximated by a linear function of T m. (B) Calculated T downhill for incipient downhill folding (3-RT 0 barrier, red) and complete downhill folding (0-RT 0 barrier, blue) as a function of melting temperature T m based on the model. (C) Free energy surfaces of the proteins with low melting temperature and high melting temperature. Only proteins of sufficient stability (naturally biased proteins) undergo complete downhill folding. N, native; C, cold denatured; H, heat denatured state.
Numerical values for the constants in Eqs. 1, 2, and 4–6 can be estimated quite accurately from the experimental data. Neglecting random fluctuations, ΔG † ≈ 5.5 − 0.046 (T m − T 0) from Fig. 5 A (units of RT 0). WW domains have similar cooperativity and reach the maximum stability at ≈5°C; thus, T cd ≈ 10°C − T m according to Fig. 1. We obtain ΔS(T m) ≈ −0.11 − 6.8 × 10⁻⁴ T m (in units of RT 0/Kelvin) from the thermodynamic fit of Eq. 1 to the circular dichroism data. The sign of the slope is as expected if hydrophobic interactions are weakened at lower temperature. ΔC p can be evaluated from Eq. 2.
To compute the downhill folding temperature, T downhill, for a given T m, one evaluates Eq. 4 and finds the temperature for which the folding barrier drops below 3 RT 0 (incipient downhill folding) or for which the barrier is exactly equal to 0 (completely downhill folding). Fig. 5 B shows the resulting plot of T downhill vs. T m for incipient and completely downhill folding, with cold denaturation included (solid curves) or neglected (dashed lines). When the possibility of cold denaturation is neglected, downhill folding always occurs at sufficiently low temperature. When cold denaturation is included, our model yields T m* ≈ 50°C for incipient downhill folding (red curve in Fig. 5 B) and T m* ≈ 90°C for complete downhill folding (blue curve in Fig. 5 B).
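The procedure just described can be sketched numerically. The code below is an illustration under stated assumptions, not the authors' fitting code: it takes the numerical estimates quoted above, assumes a quartic double well ΔG†(1 − x²)² tilted by ΔG(T)·x/2 for the free energy profile, assumes that T m enters the ΔS(T m) expression in °C, and neglects the random barrier term. All function names are hypothetical.

```python
import math

T0 = 298.0         # reference temperature in K (from the text)
T_STABLE = 278.15  # temperature of maximal stability, ~5 degC, in K (from the text)

def barrier_at_tm(tm_k):
    """Barrier at dG = 0, in units of RT0: 5.5 - 0.046 (Tm - T0)."""
    return 5.5 - 0.046 * (tm_k - T0)

def delta_s(tm_k):
    """dS(Tm) in RT0/K. ASSUMPTION: Tm enters this expression in degC."""
    return -0.11 - 6.8e-4 * (tm_k - 273.15)

def delta_g(t_k, tm_k):
    """Quadratic dG(T) with zeros at Tm and Tcd and slope -dS(Tm) at Tm."""
    tcd_k = 2.0 * T_STABLE - tm_k  # Tcd ~ 10 degC - Tm
    return -delta_s(tm_k) * (t_k - tm_k) * (t_k - tcd_k) / (tm_k - tcd_k)

def folding_barrier(t_k, tm_k):
    """Barrier (RT0) from the denatured well to the top of the ASSUMED profile
    G(x) = B (1 - x^2)^2 + (dG / 2) x. Intended for Tcd <= T <= Tm (dG <= 0).
    Returns 0.0 when the profile has no barrier."""
    b = barrier_at_tm(tm_k)
    a = 0.5 * delta_g(t_k, tm_k)
    if b <= 0.0:
        return 0.0
    # Stationary points solve x^3 - x + a/(4B) = 0; use the trigonometric
    # solution of the depressed cubic, valid while three real roots exist.
    arg = -1.5 * math.sqrt(3.0) * a / (4.0 * b)
    if abs(arg) >= 1.0:
        return 0.0
    theta = math.acos(arg) / 3.0
    scale = 2.0 / math.sqrt(3.0)
    x_top = scale * math.cos(theta - 2.0 * math.pi / 3.0)  # middle root: barrier top
    x_den = scale * math.cos(theta - 4.0 * math.pi / 3.0)  # lowest root: denatured well

    def profile(x):
        return b * (1.0 - x * x) ** 2 + a * x

    return profile(x_top) - profile(x_den)

def threshold_tm_celsius(max_barrier):
    """Lowest Tm (1 degC steps, 20-110 degC) for which the barrier at the
    temperature of maximal stability is <= max_barrier; None if not reached."""
    for tm_c in range(20, 111):
        if folding_barrier(T_STABLE, tm_c + 273.15) <= max_barrier:
            return tm_c
    return None

tm_incipient = threshold_tm_celsius(3.0)  # incipient downhill: barrier <= 3 RT0
tm_downhill = threshold_tm_celsius(0.0)   # complete downhill: no barrier
```

Under these assumptions the scan places the two thresholds within a few degrees of the T m* ≈ 50°C and T m* ≈ 90°C values reported for the model. This agreement does not establish that the original equations had exactly this form.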
These results agree well with the experimental survey: No incipient downhill folders are observed that have a T m below 52°C, and no complete downhill folders are observed up to a T m of 85°C. In essence, the onset of cold denaturation prevents downhill folding from occurring in insufficiently stable proteins. The two free energy surfaces in Fig. 5 C computed from our model illustrate the extremes of behavior. Fig. 5 C Upper corresponds to a melting temperature below the T m* for incipient downhill folding, and a barrier >3 RT 0 occurs at all temperatures between T cd and T m. The surface in Fig. 5 C Lower corresponds to T m ≈ 90°C > T m* for completely downhill folding, and no barrier occurs near (T cd + T m)/2, the temperature of maximal stability. For T > 1.1 T m or T < 0.8 T cd, downhill unfolding is obtained.
Shen et al. (29) recently studied downhill folding and unfolding temperatures for a large number of two-state folders with a native structure-based model that excludes cold denaturation. Fig. 5 B (blue dotted line) shows our fitting results with cold denaturation excluded (no quadratic term in ΔG). We find that for a WW domain of typical stability (T m ≈ 50°C) to fold downhill, T downhill/T m ≈ 229 K/323 K ≈ 0.71, compared with their estimate of <0.58 for the proteins they studied. Our estimate for the onset of downhill heat denaturation is 1.29 T m, compared with their prediction of 1.25 T m. The upper temperature ratio is in very good agreement between experiment and theory, while downhill folding turns out to be significantly easier experimentally than the calculation would indicate. It will be very interesting to see what results are obtained from statistical mechanical folding models when the temperature dependence for the interaction parameters is included.
We conclude our discussion with the properties of the molecular rate coefficient k m. The best fits (e.g., Fig. 3) were obtained by allowing anomalous diffusion with β ≤ 1, but to characterize downhill folding by a single average rate coefficient, we report fits with β fixed to 1 in Table 2. According to Yang and Gruebele (6), the average rate coefficient can be taken as the prefactor for activated kinetics. Values greater than 0.1 μs indicate reduced diffusion caused by longitudinal (along the reaction coordinate) or transverse (along other coordinates) roughness of the free energy surface (30). In a recent article, Qi and Portman (31) enhanced the ability of native topology-based folding models to predict the full range of folding rates by addition of excluded volume effects. From this model, they estimate a more realistic range of prefactors for the collective reaction coordinate, 1–5 μs. Our range of fitted molecular phase rate coefficients for the four proteins in Table 2 is in good agreement with this estimate, as well as earlier models and experimental estimates of the “speed limit” (32–34).
The fitted k m values of this small β-sheet protein (34 residues) are comparable to those of the larger helical downhill folder λ6–85* (82 residues). They are slower at lower temperature, in qualitative agreement with viscosity-dependent scaling of the diffusion coefficient. The scatter in Table 2 is too large to fit any specific viscosity model. For the folding speed limit, both linear scaling with chain length (N/100 μs) (2) and exponential scaling with contact order (18) have been proposed. The molecular time scale τm ≈ 1–2 μs observed for the helical λ6–85* is very close to the limit of 0.8 μs estimated by linear scaling. For the Pin WW β-sheet domain, the measured molecular time scale ranges from 1 to 5 μs, 3–10 times slower than the linear chain length scaling model (0.36 μs).
In summary, we have derived a simple thermodynamic model from which melting temperature zones for two-state folding and downhill folding can be determined. Application of the model to the folding kinetics of hPin WW domain yields quantitative agreement with the experimental minimum temperature required for incipient downhill folding (50°C) and complete downhill folding (≥85°C). It remains to be tested how generally this model applies to the boundary between two-state and downhill protein folding.
Materials and Methods
Protein Engineering.
Our main focus in the survey is on loop 1 of the Pin WW domain (entry 1, Table 1). This loop adopts a rare (4:6) loop conformation (Fig. 2), harboring a type-II four-residue turn sequence (–Ser-16–Arg-17–Ser-18–Ser-19–) incorporated into the six-residue loop sequence (–Ser-16–Arg-17–Ser-18–Ser-19–Gly-20–Arg-21–), apparently for functional reasons (25). Sequence alignments of over 100 WW domain family members reveal that a five-membered type 1 G1 bulge turn is the most common loop 1 structure. Given the importance of loop 1 in the folding transition state of numerous WW domain sequences, we have varied this loop in several studies to four-, five-, and six-residue loop sequences to understand structure–function–folding relationships in WW domains (25, 35). Some Pin-derived WW domain sequences exhibit an additional slower concentration-dependent phase, apparently resulting from transient reversible oligomer formation; thus, we have used Thr29Ala and/or W34F mutations to eliminate this phase. Table 1 depicts several six-, five-, and four-residue sequences prepared to test numerous hypotheses about the importance of this region in folding and function. The last six entries are part of a more extensive survey of the efficacy of β-turn mimetics prepared by the organic chemistry community over the last two decades (A.A.F., D.D., F.L., J.E.D., Gerard Kroon, Evan T. Powers, P.W., M.G., and J.W.K., unpublished results). In particular, the last six WW sequences (entries 30–35) incorporate trisubstituted E-olefin dipeptide isosteres in place of two α-amino acid residues. Protein expression and purification methods are as described in ref. 25. Protein identity was confirmed by low-resolution ESI and MALDI mass spectrometry, and purity was established chromatographically. For variants 30–35, trisubstituted alkene dipeptide isosteres in which the amide bond was replaced with Ψ[(E)-C(CH3)=CH] were incorporated into the protein by using methods described in refs. 36 and 37.
Thermal Titrations and Melting Temperatures.
Circular dichroism (CD)-detected thermal titration curves were obtained by monitoring at 227 nm. Lyophilized protein samples were dissolved in 10 mM phosphate buffer (pH 7) with concentrations near 10 μM. The temperature scans ranged from 2°C to 108°C (under an oil film for mutants with T m > 75°C).
Far-UV CD spectra were similar for all of the mutants under native conditions (data not shown). Furthermore, the crystal structures of several mutants (variants 19 and 26) have been solved and are superimposable on the wild-type Pin WW domain structure, except for loop 1 (25), the substructure of which was purposefully altered. Substantial biophysical data not depicted here are also consistent with the hypothesis that all mutants conserve the native structure of the wild-type Pin1 WW domain, except for local perturbations in loop 1 upon modification of its sequence.
Laser T-Jump Kinetics.
Protein folding kinetics were measured in 20% D2O/80% H2O, 10 mM phosphate buffer. The concentration of the protein samples ranged from 80 to 150 μM to avoid any transient aggregation (6). A Raman-shifted YAG laser pulse generated a temperature jump of 5–12°C within several nanoseconds. The relaxation of the protein to the new equilibrium was probed by tryptophan fluorescence following excitation of the protein by 280-nm UV laser pulses. The fluorescence decays were digitized with 500-ps time resolution. Several decays were averaged, and data were binned into 140- to 280-ns bins to improve the signal-to-noise ratio. The evolution of the fluorescence lifetime profile was extracted by χ analysis as described in ref. 38, to reveal deviations from two-state folding via nonexponential decays (Fig. 3). The temperature range of the experiments was limited by the low signal-to-noise ratio at temperatures far below T m and by photoacoustic cavitation at temperatures well above T m.
Acknowledgments
We thank Drs. Houbi Nguyen and Marcus Jäger for measuring the kinetics and thermodynamics of WW domains from ref. 39. This work was supported by National Science Foundation Grant MCB 0313643 (to F.L. and M.G.), National Institutes of Health Grant GM 051105, the Skaggs Institute of Chemical Biology (D.D., A.A.F., and J.W.K.), the Lita Annenberg Hazen Foundation (D.D., A.A.F., and J.W.K.), and National Institutes of Health P50 Grant GM067082 (to J.E.D. and P.W.). A.A.F. is supported by a Ruth L. Kirschstein National Research Service Award fellowship.
Footnotes
- ‡‡To whom correspondence should be addressed. E-mail: gruebele{at}scs.uiuc.edu
Author contributions: J.W.K. and M.G. designed research; F.L., D.D., A.A.F., and J.E.D. performed research; J.E.D. and P.W. contributed new reagents/analytic tools; F.L. and M.G. analyzed data; and F.L., J.W.K., and M.G. wrote the paper.
The authors declare no conflict of interest.
- © 2008 by The National Academy of Sciences of the USA




