Physics approaches focus on uncovering, modeling and
quantitating the general principles governing the micro and macro
universe. This has always been an important component of biological
research; however, recent advances in experimental techniques and
the accumulation of unprecedented genome-scale experimental data
produced by these novel technologies now allow for addressing
fundamental questions on a large scale. These relate to molecular
interactions, principles of biomolecular recognition, and mechanisms
of signal propagation.
The functioning of a cell requires a variety of intermolecular
interactions, including protein–protein, protein–DNA and
protein–RNA interactions, as well as interactions involving
hormones, peptides, small molecules, lipids and
more. Biomolecules work together to provide specific functions and
perturbations in intermolecular communication channels often lead
to cellular malfunction and disease. A full understanding of the
interactome requires an in-depth grasp of the biophysical
principles underlying individual interactions as well as their
organization in cellular networks.
Phenomena can be described at different levels of abstraction.
Computational and systems biology strive to model cellular
processes by integrating and analyzing complex data from multiple
experimental sources using interdisciplinary tools. As a result,
both the causal relationships between the variables and the general
features of the system can be discovered. Even without
knowledge of the details of the underlying mechanisms, these allow
hypotheses to be put forth and the behavior of the systems in
response to perturbation to be predicted. Herein lies the strength of
in silico models, which provide control and predictive power.
At the same time, the complexity of individual elements and
molecules can be addressed by the fields of molecular biophysics,
physical biology and structural biology, which focus on the
underlying physico-chemical principles and may explain the
molecular mechanisms of cellular function.
In this issue we have assembled a representative set of papers
written by experts with diverse scientific backgrounds, each
offering a unique viewpoint on using computational and physics
methods to study biological systems at different levels of
organization. We start with studies that aim to decipher the
mechanisms of molecular recognition using biophysics methods and
then expand our scale, concluding the issue with studies of
interaction networks at cellular and population levels.
Biomolecules interact with each other in a highly specific
manner and selectively recognize their partners among hundreds of
thousands of other molecules. As the paper by Zhang
et al points out, this recognition process should be fast
and guided by long-range electrostatic forces that select and bring
the interacting partners together. The authors show that an
increase in salt concentration leads to destabilization of protein
complexes, suggesting an optimization of the charge–charge
interactions across the protein binding interfaces. The following
paper by Berezovsky further explores the balance of different
interactions in protein complexes and uses physical concepts to
explain the entire spectrum of protein structural classes, from
intrinsically disordered to hyperthermostable proteins. The author
describes highly unstructured viral proteins at one end of the
spectrum and discusses the balance of stabilizing interactions in
protein complexes from thermophilic organisms at the other.
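The link between salt concentration and charge–charge interactions mentioned above can be illustrated with a minimal sketch based on Debye-Hückel screening of two point charges in water. This is a textbook approximation chosen here for illustration, not the method used by Zhang et al; the function names, the 1 nm separation and the salt concentrations are assumptions.

```python
import math

# Physical constants (SI units).
E_CHARGE = 1.602176634e-19    # elementary charge, C
K_B = 1.380649e-23            # Boltzmann constant, J/K
EPS0 = 8.8541878128e-12       # vacuum permittivity, F/m
N_A = 6.02214076e23           # Avogadro constant, 1/mol


def debye_length_nm(ionic_strength_molar, temperature_k=298.15,
                    rel_permittivity=78.5):
    """Debye screening length in nm for an aqueous electrolyte."""
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    kappa_sq = (2.0 * N_A * E_CHARGE ** 2 * ionic_strength_si
                / (EPS0 * rel_permittivity * K_B * temperature_k))
    return 1e9 / math.sqrt(kappa_sq)


def screened_coulomb_kj_mol(q1, q2, distance_nm, ionic_strength_molar,
                            temperature_k=298.15, rel_permittivity=78.5):
    """Debye-Hueckel energy of two point charges (in units of e), in kJ/mol."""
    coulomb = (q1 * q2 * E_CHARGE ** 2 * N_A
               / (4.0 * math.pi * EPS0 * rel_permittivity * distance_nm * 1e-9))
    screening = math.exp(-distance_nm / debye_length_nm(
        ionic_strength_molar, temperature_k, rel_permittivity))
    return coulomb * screening / 1000.0  # J/mol -> kJ/mol


# Assumed illustrative case: opposite unit charges 1 nm apart.
for salt in (0.01, 0.1, 0.5):
    energy = screened_coulomb_kj_mol(+1, -1, 1.0, salt)
    print(f"I = {salt:4.2f} M  Debye length = {debye_length_nm(salt):.2f} nm  "
          f"E = {energy:.2f} kJ/mol")
```

In this simplified picture the attraction between opposite charges weakens as the ionic strength rises, which is consistent with the destabilization of complexes at higher salt described above. Real binding interfaces involve many charges, desolvation and a non-uniform dielectric environment, none of which this sketch captures.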
Recently accumulated evidence has indicated that native proteins
do not necessarily require a unique structure to be biologically
active, and in some cases structural disorder or intrinsic
flexibility can be a prerequisite for their function. From the
physical point of view, these disordered/flexible proteins exist in
dynamic equilibrium between different conformational states, some
of which could be selected upon binding to another partner. Such a
property allows disordered proteins to achieve specific binding and
at the same time reversibility and diversity in their interactions.
Interestingly, as is shown in the paper by Mészáros
et al, even though some disordered regions and proteins have
a tendency to fold upon binding, the structures of their complexes
still reveal their inherent flexibility. Indeed, disordered
proteins and their complexes have certain properties which
distinguish them from proteins with well-defined structures. This
is evident from the papers by Lobanov and Galzitskaya, and
Mészáros
et al, which show that such characteristic features of
disordered proteins allow their successful computational prediction
from the sequence alone. Computational prediction of protein
disorder has been used in another study by Takeda
et al where the authors investigate the role of disorder in
the function of a specific actin capping protein. The paper
presents normal mode analysis with the elastic network model to
examine the mechanisms of intrinsic flexibility and its biological
role in actin function.
Analysis of the underlying mechanisms and key factors in protein
recognition might be essential for the prediction of
protein–protein interactions. The papers by Tuncbag
et al and Hashimoto
et al demonstrate how incorporating the physico-chemical
properties of binding interfaces and their atomic details obtained
from protein crystal structures might be used to increase the
accuracy of predicted protein–protein interactions and
provide data on relative orientations of interacting proteins and
on the locations of binding sites. Moreover, analysis of
protein–protein interactions might require further
fine-tuning for different types of assemblies, as shown in
the example of homooligomers by Hashimoto
et al.
Studies of protein–protein interactions at the molecular
level have contributed considerably to understanding the principles
of large-scale organization of the cellular interactome. Using
graph theory as a unifying language, many characteristic properties
of biomolecular networks have been identified, including a scale-free
distribution of the vertex degree, network motifs, and modularity,
to name a few. These studies of network organization require the
network to be as complete as possible, which, given the limitations
of experimental techniques, is not currently the case. Therefore,
experimental procedures for detecting biomolecular interactions
should be complemented by computational approaches. The paper by
Lees
et al provides a review of computational methods,
integrating multiple independent sources of data to infer physical
and functional protein–protein interaction networks. One of
the important aspects of protein interactions that should be
accounted for in the prediction of protein interaction networks is
that many proteins are composed of distinct domains. Protein
domains may mediate protein interactions while proteins and their
interaction networks may gain complexity through gene duplication
and expansion of existing domain architectures via domain
rearrangements. The latter mechanisms have been explored in detail
in the paper by Cohen-Gihon
et al.
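The vertex degree distribution mentioned above is simple to tabulate from a list of pairwise interactions. The following minimal sketch assumes an undirected network given as an edge list; the protein names and the toy network are hypothetical and are not data from any paper in this issue.

```python
from collections import Counter


def degree_distribution(edges):
    """Return {k: fraction of vertices with degree k} for an undirected edge list."""
    # Drop self-interactions and count each unordered pair only once.
    unique_pairs = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for u, v in unique_pairs:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())
    n_vertices = len(degree)
    return {k: counts[k] / n_vertices for k in sorted(counts)}


# Hypothetical toy interactome: one hub (P1) and several low-degree proteins.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
             ("P1", "P5"), ("P2", "P3"), ("P5", "P6")]
toy_distribution = degree_distribution(toy_edges)
for k, fraction in toy_distribution.items():
    print(f"degree {k}: fraction {fraction:.2f}")
```

A scale-free network would show a heavy-tailed distribution, roughly P(k) proportional to k raised to a negative power, with a few highly connected hubs. Testing for such behavior in real data requires far larger networks and careful statistical fitting, which this sketch does not attempt.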
Protein–protein interactions are not the only component of
the cell's interactome. Regulation of cell activity can be achieved
at the level of transcription and involves transcription
factor–DNA binding, which typically requires recognition of a
specific DNA sequence motif. ChIP-chip and the more recent ChIP-seq
technologies allow
in vivo identification of DNA binding sites and, together
with novel
in vitro approaches, provide data necessary for deciphering
the corresponding binding motifs. Such information, complemented by
structures of protein–DNA complexes and knowledge of the
differences in binding sites among homologs, opens the door to
constructing predictive binding models. The paper by Persikov and
Singh provides an example of such a model in the Cys2His2 zinc finger family.
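As a simple illustration of how a binding motif can be turned into a predictive model, the sketch below builds a position weight matrix (PWM) from aligned binding sites and scans a sequence for the best-scoring window. A PWM is a standard baseline and is not the model developed by Persikov and Singh; the aligned sites, the target sequence, the pseudocount and the uniform background are all assumptions made for the example.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from equal-length aligned DNA sites."""
    width = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(width):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total)
                            / background)
            for base in "ACGT"
        })
    return pwm


def best_site(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    scored = [
        (sum(pwm[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(scored)


# Hypothetical aligned binding sites and a hypothetical target sequence.
example_sites = ["GCGTGGGCG", "GCGGGGGCG", "GCGTGGGCG", "GCGTAGGCG"]
example_pwm = build_pwm(example_sites)
score, start = best_site(example_pwm, "TTAGCGTGGGCGATC")
print(f"best window starts at index {start} with log-odds score {score:.2f}")
```

A PWM treats positions within the site as independent. Models that use structures of protein–DNA complexes or binding-site differences among homologs, as described above, can go beyond this assumption.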
Recent studies have indicated that the presence of such binding
motifs is, however, neither necessary nor sufficient for
transcription factor activity. Transcription regulation is a
complex and still not fully understood process involving, in
addition to protein–DNA binding, other factors such as
epigenetic modifications and three-dimensional DNA organization. In
this issue, Levens and Benham discuss another important mechanism
which is likely to contribute to overall gene
regulation: changes in DNA secondary structure in response to
supercoiling-induced stress. Pointing out that DNA is "more than a
cipher", they argue that the DNA structural transitions driven by
negative supercoiling may have profound consequences for the cell
and have to be accounted for in detailed models. There is
considerable progress in physical modeling of DNA dynamics in
response to stress. Such efforts, supported by experimental data,
will bring us closer to an understanding of the role of
supercoiling in gene regulation.
Large-scale biomolecular interaction networks not only provide a
system-level view of cellular processes, but are also increasingly
used to model communications between molecules. The lack of
sufficient biochemical data and the gigantic scale of the network
have prevented detailed modeling of network dynamics and have stimulated
the development of simplified models such as the information flow
approach described by Kim
et al in this issue. Importantly, despite their simplicity,
such models proved to be extremely useful for identifying network
modules, essential nodes, and molecular pathways which are
dysregulated in complex diseases such as cancer.
Finally, moving from studies of single cells towards
populations, one has to recognize the heterogeneity present within
a population of cells. In the context of protein abundance, such
cell-to-cell variation within clonal populations of cells, referred
to as expression noise, has recently become a focus of intense
cross-disciplinary research. Concerted efforts of experimentalists,
physicists and mathematicians have brought us closer to
understanding the source, potential drawbacks and benefits of noise
for cell function. Differences in protein expression levels are
even more pronounced in samples from mixed cell populations. How
does such a mixture of cell populations affect the measurements of
total gene expression? This question is addressed by Hebenstreit
and Teichmann who show that decomposing a signal coming from a
mixture of cellular populations requires insights from theoretical
modeling.
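Why a mixed-population measurement needs a model to be interpreted can be seen in a deliberately simple sketch. It assumes that the bulk signal is a linear, fraction-weighted average of per-population expression levels, which is a simplification introduced here and not the analysis of Hebenstreit and Teichmann. All function names and numbers are hypothetical.

```python
def bulk_expression(fractions, per_population_levels):
    """Bulk signal as the fraction-weighted mean of per-population levels."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("population fractions must sum to 1")
    return sum(f * x for f, x in zip(fractions, per_population_levels))


def solve_two_populations(bulk_a, frac_a, bulk_b, frac_b):
    """Recover the levels of two populations from two bulk samples.

    frac_a and frac_b are the known fractions of population 1 in samples
    A and B; they must differ for the system to be solvable.
    """
    if abs(frac_a - frac_b) < 1e-12:
        raise ValueError("samples need different mixing fractions")
    difference = (bulk_a - bulk_b) / (frac_a - frac_b)  # level1 - level2
    level_2 = bulk_a - frac_a * difference
    return level_2 + difference, level_2


# Two different hypothetical mixtures give the same bulk value, so one
# bulk measurement alone cannot distinguish them.
print(bulk_expression([0.5, 0.5], [10.0, 0.0]))
print(bulk_expression([1.0, 0.0], [5.0, 0.0]))

# With two samples of known, different composition the levels are recoverable.
sample_a = bulk_expression([0.8, 0.2], [10.0, 2.0])
sample_b = bulk_expression([0.3, 0.7], [10.0, 2.0])
print(solve_two_populations(sample_a, 0.8, sample_b, 0.3))
```

The first two calls show the ambiguity of a single bulk value. The last call shows that a mixing model plus additional samples resolves it under the stated linear assumption.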
Recent technological advancements permitting genome-wide scale
measurements of diverse molecular properties and consequently
higher levels of quantitative reasoning are attracting physicists,
mathematicians and computer scientists to the study of biological
systems. Building on the synergy between these fields, we are
entering an exciting era in which physics methods, used in
conjunction with these disciplines and combined with statistical
methods, provide quantitative descriptions of biology.
Acknowledgments
This project was funded with federal funds from the National
Cancer Institute, National Institutes of Health, under contract
number HHSN261200800001E. This research was supported by the
Intramural Research Program of the NIH, National Cancer Institute,
Center for Cancer Research and the National Library of Medicine at
National Institutes of Health/DHHS. The content of this publication
does not necessarily reflect the views or policies of the
Department of Health and Human Services, nor does mention of trade
names, commercial products or organizations imply endorsement by
the US Government.