Scientific Activities
Protein Folding and Aggregation
Minimalist Reduction
We present the results of sequence design on our off-lattice minimalist
model, in which no specification of native-state tertiary contacts is
needed. We start with a sequence that adopts a target topology and build
upon it through sequence mutation to produce new sequences that comprise
distinct members within a target fold class. In this work we use the α/β
ubiquitin fold class and design two new sequences which, when
characterized through folding simulations, reproduce the differences in
folding mechanism seen experimentally for proteins L and G. These
results indicate that a basic rule of patterning of hydrophobic and
hydrophilic residues is the physical origin of the success of relative
contact-order descriptions of folding. They also indicate that the
patterning is tolerant to a small number of mutations, which would
manifest as poorly conserved residues being found in the folding
nucleus, while remaining consistent with the robustness of fold
topologies to mutation. We also suggest a possible criterion for
performing sequence mappings from a 20-letter amino-acid code to a
3-letter reduced code for generalization to protein design.
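A mapping from the 20-letter code to a 3-letter reduced code can be sketched as follows. This is only an illustration: the partition of residues into hydrophobic, neutral, and hydrophilic classes, the class labels `H`, `N`, `P`, and the function names are assumptions made for the example, not the mapping criterion proposed in this work.

```python
# Hypothetical grouping of the 20 amino acids into three classes.
# This partition is an illustrative assumption, not the mapping
# criterion proposed in the work summarized above.
HYDROPHOBIC = set("AVLIMFWC")
NEUTRAL = set("GPSTY")
# Everything else (D, E, K, R, H, N, Q) is treated as hydrophilic.
VALID = set("ACDEFGHIKLMNPQRSTVWY")


def reduce_sequence(seq: str) -> str:
    """Map a one-letter amino-acid sequence onto a 3-letter code.

    H = hydrophobic, N = neutral, P = hydrophilic (polar or charged).
    """
    reduced = []
    for aa in seq.upper():
        if aa not in VALID:
            raise ValueError(f"unknown residue: {aa!r}")
        if aa in HYDROPHOBIC:
            reduced.append("H")
        elif aa in NEUTRAL:
            reduced.append("N")
        else:
            reduced.append("P")
    return "".join(reduced)


def pattern_identity(a: str, b: str) -> float:
    """Fraction of positions whose reduced-code class matches."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    ra, rb = reduce_sequence(a), reduce_sequence(b)
    return sum(x == y for x, y in zip(ra, rb)) / len(ra)
```

Two sequences that differ at the amino-acid level but share the same reduced pattern score 1.0 in `pattern_identity`, which is one simple way to express tolerance of the hydrophobic/hydrophilic patterning to mutation.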
Computational Methods, Algorithms, Models
Monte Carlo Algorithms
Effective relaxation processes for difficult systems like proteins or
spin glasses require special simulation techniques that permit barrier
crossing to ensure ergodic sampling. Numerous adaptations of the
venerable Metropolis Monte Carlo (MMC) algorithm have been proposed to
improve its sampling efficiency, including various hybrid Monte Carlo
(HMC) schemes, and methods designed specifically for overcoming
quasi-ergodicity problems such as Jump Walking (J-Walking), Smart
Walking (S-Walking), Smart Darting, and Parallel Tempering. We present
an alternative to these approaches that we call Cool Walking, or
C-Walking. In C-Walking two Markov chains are propagated in tandem, one
at a high (ergodic) temperature and the other at a low temperature.
Non-local trial moves for the low-temperature walker are generated by
first sampling from the high-temperature distribution, then performing a
statistical quenching process on the sampled configuration to generate a
C-Walking jump move. C-Walking needs only one high-temperature walker,
satisfies detailed balance, and offers the important practical advantage
that the high- and low-temperature walkers can be run in tandem with
minimal degradation of sampling due to the presence of correlations.
To make the C-Walking approach more suitable for real problems, we
decrease the required number of cooling steps by attempting to jump at
intermediate temperatures during cooling. We further reduce the number
of cooling steps by utilizing “windows” of states when jumping, which
improves acceptance ratios and lowers the average number of cooling
steps. We present C-Walking results with comparisons to J-Walking,
S-Walking, Smart Darting, and Parallel Tempering on a one-dimensional
rugged potential energy surface for which the exact normalized
probability distribution is known. C-Walking shows superior sampling as
judged by two ergodic measures.
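As a point of reference for the comparison above, the following is a minimal sketch of Metropolis Monte Carlo combined with Parallel Tempering swaps, one of the established methods named here. It is not an implementation of C-Walking, whose quenching and acceptance rules are not given in this summary. The potential, temperatures, step size, and function names are all assumptions chosen for illustration.

```python
import math
import random


def potential(x: float) -> float:
    """Hypothetical rugged 1D surface: a double well plus ripples."""
    return (x * x - 1.0) ** 2 + 0.2 * math.sin(12.0 * x)


def metropolis_step(x, beta, step, rng):
    """One local Metropolis trial move at inverse temperature beta."""
    x_new = x + rng.uniform(-step, step)
    delta = potential(x_new) - potential(x)
    if delta <= 0.0 or rng.random() < math.exp(-beta * delta):
        return x_new
    return x


def parallel_tempering(n_sweeps, betas, step=0.2, swap_every=10, seed=0):
    """Run one walker per temperature; return samples of walker 0.

    betas should be ordered from coldest (largest beta) to hottest.
    """
    rng = random.Random(seed)
    xs = [-1.0] * len(betas)  # all walkers start in the left well
    cold_samples = []
    for sweep in range(n_sweeps):
        xs = [metropolis_step(x, b, step, rng) for x, b in zip(xs, betas)]
        if len(betas) > 1 and sweep % swap_every == 0:
            i = rng.randrange(len(betas) - 1)  # pick a neighbouring pair
            # Swap acceptance: min(1, exp[(beta_i - beta_j)(U_i - U_j)])
            arg = (betas[i] - betas[i + 1]) * (
                potential(xs[i]) - potential(xs[i + 1])
            )
            if arg >= 0.0 or rng.random() < math.exp(arg):
                xs[i], xs[i + 1] = xs[i + 1], xs[i]
        cold_samples.append(xs[0])
    return cold_samples
```

A histogram of the returned samples can be compared with the normalized Boltzmann distribution of `potential` at the coldest temperature, which is the kind of check that a one-dimensional test surface with a known distribution makes possible.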
Implicit Solvent Model
We have developed a solvation function that combines a Generalized Born
model for polarization of protein charge by the high dielectric solvent
with a hydrophobic potential of mean force as a model for hydrophobic
interaction, to aid in the discrimination of native structures from
misfolded states in protein structure prediction. We find that our
energy function outperforms other reported scoring functions in terms of
correct native ranking for 91% of proteins and low Z-scores for a
variety of decoy sets, including the challenging Rosetta decoys. Decoys
generated by thermal sampling around the native state basin reveal a
potentially important role for side chain entropy in future development
of even more accurate free energy surfaces.
We also demonstrate the performance of the new implicit solvent model on
native protein loop prediction from a large set of loop decoys of 4- to
12-residue lengths. While our results for small loop decoy sets are
comparable to those of existing energy functions, we find demonstrable
superiority for loop lengths of 8 residues and greater, and that the
quality of our predictions is largely insensitive to the length of the
target loop on a filtered set of decoys. Given that the current weakness
in loop modeling is the ability to select the most native-like loop
conformers from loop ensembles, this energy function provides a means
for greater prediction accuracy in structure prediction of homologous
and distantly related proteins, thereby aiding large-scale genomics
efforts in comparative modeling. Together this work shows that a
description that includes the stabilizing effect of hydrophobic exposure
to aqueous solvent, which defines the hydration physics, is an apparent
improvement over solvent accessible surface area models that penalize
hydrophobic exposure.
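The two decoy-discrimination measures mentioned above, native ranking and Z-score, can be sketched as follows. This is a generic illustration under the usual definitions (rank of the native energy within the decoy set, and the native energy expressed in standard deviations from the decoy mean). The function names and the example energies are hypothetical and do not come from this work.

```python
import statistics


def native_rank(native_energy, decoy_energies):
    """Rank of the native structure (1 = lowest energy in the set)."""
    return 1 + sum(e < native_energy for e in decoy_energies)


def native_z_score(native_energy, decoy_energies):
    """Z-score of the native energy relative to the decoy distribution.

    A more negative value means the native structure is better
    separated from the decoys.
    """
    mean = statistics.fmean(decoy_energies)
    spread = statistics.pstdev(decoy_energies)
    if spread == 0.0:
        raise ValueError("decoy energies have zero spread")
    return (native_energy - mean) / spread


# Hypothetical energies in arbitrary units, for illustration only.
decoys = [-8.0, -10.0, -12.0, -10.0]
rank = native_rank(-14.0, decoys)
z = native_z_score(-14.0, decoys)
```

A scoring function ranks the native structure correctly when `native_rank` returns 1, and it discriminates more strongly as the Z-score becomes more negative.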
Distributed Computing
The distributed computing (DC) paradigm in conjunction with the
Folding@home (FH) client server has been used to study the folding
kinetics of small peptides and proteins, giving excellent agreement
with experimentally measured folding rates, although pathways sampled
in these simulations are not always consistent with the folding
mechanism. In this study, we use a coarse-grain model of protein L,
whose two-state kinetics have been characterized in detail by using
long-time equilibrium simulations, to rigorously test an FH protocol
using approximately 10,000 short-time, uncoupled folding simulations
starting from an extended state of the protein.
We show that the FH
results give non-Poisson distributions and early folding events that
are unphysical, whereas longer folding events experience a correct
barrier to folding but are not representative of the equilibrium
folding ensemble. Using short-time, uncoupled folding simulations
started from an equilibrated denatured state ensemble (DSE), we also do
not get agreement with the equilibrium two-state kinetics because of
overrepresented folding events arising from higher energy
subpopulations in the DSE. The DC approach using uncoupled short
trajectories can make contact with traditionally measured experimental
rates and folding mechanism when starting from an equilibrated DSE,
when the simulation time is long enough to sample the lowest energy
states of the unfolded basin and the simulated free-energy surface is
correct. However, the DC paradigm, together with faster time-resolved
and single-molecule experiments, can also reveal the breakdown in the
two-state approximation due to observation of folding events from
higher energy subpopulations in the DSE.
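The Poisson assumption behind rate estimates from many short, uncoupled trajectories can be sketched as follows. For ideal two-state kinetics with rate k, first-passage times are exponentially distributed, so the fraction of trajectories that fold within a time t is 1 - exp(-k t). The code below draws ideal folding times and inverts that relation. The rate, trajectory length, trajectory count, and function names are assumptions for illustration, not the protocol or data of this study.

```python
import math
import random


def simulate_folding_times(rate, n_traj, seed=0):
    """Draw first-passage times for an ideal two-state (Poisson) folder."""
    rng = random.Random(seed)
    return [rng.expovariate(rate) for _ in range(n_traj)]


def rate_from_short_trajectories(folding_times, t_max):
    """Estimate the rate from the fraction of trajectories folded by t_max.

    Inverts P(fold by t_max) = 1 - exp(-k * t_max), which holds only
    if the kinetics are single-exponential.
    """
    n_folded = sum(t <= t_max for t in folding_times)
    fraction = n_folded / len(folding_times)
    if fraction >= 1.0:
        raise ValueError("all trajectories folded; t_max is too long")
    return -math.log(1.0 - fraction) / t_max


# Hypothetical numbers: 10,000 trajectories, each far shorter than 1/k.
times = simulate_folding_times(rate=0.01, n_traj=10_000, seed=1)
k_est = rate_from_short_trajectories(times, t_max=5.0)
```

When the underlying kinetics are not single-exponential, for example because early folding events come from higher energy subpopulations, the estimate from this inversion depends on the chosen `t_max`, which is one way non-Poisson behavior shows up.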
Coarse Grained Protein Models
We have recently developed a sequence-based α-carbon model to
incorporate a mean field estimate of the orientation dependence of the
polypeptide chain that gives rise to specific hydrogen bond pairing to
stabilize α-helices and β-sheets. We illustrate the success of the new
protein model in improving on thermodynamic measures and folding
mechanism of proteins L and G. The model shows greater folding
cooperativity and improvements in designability of protein sequences,
as well as predicting correct trends for kinetic rates and mechanism
for proteins L and G. We believe the model is broadly applicable to
other protein folding and protein-protein co-assembly processes, and
does not require experimental input beyond the topology description of
the native state. Even without tertiary topology information, it can
also serve as a mid-resolution protein model for more exhaustive
conformational search strategies that can bridge back down to atomic
descriptions of the polypeptide chain.
We present a new general analytical solution for computing the screened
electrostatic interaction between multiple macromolecules of
arbitrarily complex charge distributions, assuming they are well
described by spherical low dielectric cavities in a higher dielectric
medium in the presence of a Debye-Hückel treatment of salt. The
benefits of this approach are threefold. First, by exploiting multipole
expansion theory for the screened Coulomb potential, we can describe
direct charge-charge interactions and all significant higher-order
cavity polarization effects between low dielectric spherical cavities
containing their charges, while treating these higher order terms
correctly at all separation distances. Second, our analytical solution
is general to arbitrary numbers of macromolecules, is efficient to
compute, and can therefore simultaneously provide on-the-fly updates to
changes in charge distributions due to protein conformational changes.
Third, we can change spatial resolutions of charge description as a
function of separation distance without compromising the desired
accuracy. While the current formulation describes solutions based on
simple spherical geometries, it appears possible to reformulate these
electrostatic expressions to smoothly increase spatial resolution back
to greater molecular detail of the dielectric boundaries.
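For orientation, the leading term of a Debye-Hückel description, the screened Coulomb interaction between two point charges, can be sketched as follows. This covers only the monopole-monopole term in a uniform dielectric. It omits the low dielectric cavities, higher-order multipoles, and cavity polarization that the analytical solution described above treats. The function names and default parameters (relative permittivity 78.4, 298.15 K) are assumptions for illustration.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
K_B = 1.380649e-23          # Boltzmann constant, J/K
EPS_0 = 8.8541878128e-12    # vacuum permittivity, F/m
N_A = 6.02214076e23         # Avogadro constant, 1/mol


def debye_length(ionic_strength_molar, eps_r=78.4, temperature=298.15):
    """Debye screening length in metres for an ionic strength in mol/L."""
    ions_per_m3 = ionic_strength_molar * 1000.0 * N_A
    kappa_sq = (2.0 * ions_per_m3 * E_CHARGE**2
                / (EPS_0 * eps_r * K_B * temperature))
    return 1.0 / math.sqrt(kappa_sq)


def screened_coulomb(z1, z2, r, ionic_strength_molar,
                     eps_r=78.4, temperature=298.15):
    """Debye-Hückel pair energy in units of k_B T.

    z1, z2 are charges in units of e; r is the separation in metres.
    Point charges in a uniform dielectric: no cavity terms.
    """
    bjerrum = E_CHARGE**2 / (4.0 * math.pi * EPS_0 * eps_r * K_B * temperature)
    kappa = 1.0 / debye_length(ionic_strength_molar, eps_r, temperature)
    return z1 * z2 * bjerrum * math.exp(-kappa * r) / r
```

At an ionic strength of 0.1 mol/L the screening length is close to 1 nm, so the interaction between charges separated by a few nanometres is already strongly damped relative to the unscreened Coulomb value.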
Bulk Water and Aqueous Hydration
Water structure controversy
It has been suggested, based on x-ray absorption spectroscopy (XAS)
experiments on liquid water (Wernet et al., Science 2004), that each
water molecule, on average, donates only one hydrogen bond and in turn
accepts only one hydrogen bond. The larger implication of the XAS result
is that the conventional view of water organizing as a four-fold,
tetrahedrally coordinated random network is not true, and that water
instead organizes as hydrogen-bonded chains or
large rings embedded in a weakly hydrogen-bonded disordered network.
This is a radical departure from what is known about liquid water,
which is thought to belong to the class of tetrahedral liquids such as
silica and germanium.
This alternative structural view potentially impacts previous
interpretations of experimental and theoretical work on water, ice,
tetrahedral and associated liquids, and educators who teach students
about hydrogen-bonding of the world’s most important liquid and
chemical bonding in general. Given the importance of water as a
solvent, there are also broad implications for biological molecules,
the design of novel materials, and experimental probes that yield
fundamental signatures of water as evidence of life on other planets.
Given the broader scientific and educational context, radically
alternative structural interpretations of liquid water need to be
challenged.
A vast array of experimental data on water provides a global view of
the liquid that implicates its tetrahedral hydrogen-bonding network as
the unifying molecular connection to its observed structural,
thermodynamic, and dielectric property trends with temperature. Anyone
who advocates an alternative structural picture for liquid water must
consider these other, non-structural data. Although we did not think it
possible that chain networks could be consistent with these known liquid
water trends with temperature, there was no existing evidence that
directly refuted such a possibility. Therefore we decided to examine the
consequences of chain networks using three different modified water
models that exhibit a local hydrogen-bonding environment of two
hydrogen bonds (2HB) and therefore networks of chains. Using
these very differently parameterized models we evaluate their bulk
densities, enthalpies of vaporization, heat capacities, isothermal
compressibilities, thermal expansion coefficients, and dielectric
constants over the temperature range of 235 K to 323 K. We also evaluate
the entropy of the 2HB models at room temperature and whether such
models nucleate ice Ih. All show poor agreement with experimentally
measured thermodynamic and dielectric properties over the same
temperature range, and behave similarly in most respects to normal
liquids. This contrasts with many modern simulation models of water that
reproduce experimentally determined thermodynamic, dynamic, and
dielectric trends with temperature. These models yield liquid structure
that shows significant tetrahedral order and more hydrogen bonding than
advocated by Wernet et al. Thus it appears that
water structure based on hydrogen-bonded chains is inconsistent with
liquid water as we know it through a multitude of experiments.
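Several of the response properties listed above are commonly estimated from fluctuations in a constant-pressure, constant-temperature simulation. The sketch below uses the standard fluctuation formulas for the isobaric heat capacity, isothermal compressibility, and thermal expansion coefficient. It is a generic illustration and not necessarily how the properties were computed in this study. The input series of instantaneous enthalpies and volumes, and the function names, are hypothetical.

```python
import statistics

K_B = 1.380649e-23  # Boltzmann constant, J/K


def heat_capacity_cp(enthalpies, temperature):
    """C_p = var(H) / (k_B T^2), with H in joules for the whole box."""
    return statistics.pvariance(enthalpies) / (K_B * temperature**2)


def isothermal_compressibility(volumes, temperature):
    """kappa_T = var(V) / (k_B T <V>), with V in m^3; result in 1/Pa."""
    mean_v = statistics.fmean(volumes)
    return statistics.pvariance(volumes) / (K_B * temperature * mean_v)


def thermal_expansion(volumes, enthalpies, temperature):
    """alpha_P = cov(V, H) / (k_B T^2 <V>); result in 1/K."""
    if len(volumes) != len(enthalpies):
        raise ValueError("series must have equal length")
    mean_v = statistics.fmean(volumes)
    mean_h = statistics.fmean(enthalpies)
    products = [(v - mean_v) * (h - mean_h)
                for v, h in zip(volumes, enthalpies)]
    return statistics.fmean(products) / (K_B * temperature**2 * mean_v)
```

A negative volume-enthalpy covariance gives a negative thermal expansion coefficient, the signature of a density maximum, so the temperature dependence of these three quantities is a sensitive test of whether a model behaves like water or like a normal liquid.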
An alternative structure for liquid water based on chain networks
should certainly have been anticipated to be controversial, but it
seemed aimed directly at overturning conventional wisdom. Similar
scientific excitement must have existed in the early days of the
discovery of “polywater”, but eventually that alternative view proved
to be false because the characterized water contained chemical
impurities that ultimately explained polywater’s unusual, and
un-water-like, properties. It would seem that the first step in judging
whether a new structural view of water is correct is to establish
whether it “fits” into a larger experimental context of other
structural, thermodynamic, dynamical, and dielectric data collected
over decades by many able scientists. Theory and simulation can
more directly address “what is” the global view of the liquid and its
phases, thereby providing a better reference state for investigating
“what is not”.