Sung-Hou Kim and Igor Grigoriev,
Lawrence Berkeley National Laboratory
Using sensitive fold recognition methods (PCM and DuP) allows us to assign
folds for more proteins encoded in the complete genome of yeast,
S. cerevisiae.
Research Objectives
Our objective is to use computation to obtain the basic structural set for
the organisms with completely sequenced genomes. We identify and classify
protein folds in complete genomes to find new targets for structural analysis,
predict protein structures when possible, and use computational tools
for solving protein structures from X-ray data.
Computational Approach
We have developed two different methods, the Proximity Correlation Method (PCM)
and the Dual Profile method (DuP), for detecting similarity of protein folds
by comparison of protein sequences. In these methods we combine secondary
structure predictions with correlation of physical (in PCM) or evolutionary
(in DuP) properties of amino acid residues. Segments of sequences rather
than single residues are compared, which substantially increases sensitivity
in detecting remote homologues, i.e., proteins with similar folds and
low sequence similarities that cannot be detected by standard sequence
comparison techniques. For ab initio structural predictions, we
derived new energy potentials for contacts between amino acid residues
in protein structures and developed a procedure for efficient structure
calculation by torsion angle dynamics.
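The idea of comparing segments rather than single residues can be illustrated with a minimal sketch. This is not the PCM or DuP implementation: it assumes the Kyte-Doolittle hydropathy scale as the physical property, a fixed window length, and a plain Pearson correlation, and it omits the secondary structure predictions that the methods above combine with the correlation. The function names are hypothetical.

```python
from math import sqrt

# Kyte-Doolittle hydropathy scale. Using it here is an assumption made for
# illustration; it is not stated to be the property set used by PCM.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def segment_scores(seq_a, seq_b, window=7):
    """Correlate the property profile of every window of seq_a with every window of seq_b."""
    pa = [KD[r] for r in seq_a]
    pb = [KD[r] for r in seq_b]
    scores = {}
    for i in range(len(pa) - window + 1):
        for j in range(len(pb) - window + 1):
            scores[(i, j)] = pearson(pa[i:i + window], pb[j:j + window])
    return scores


def best_segment_pair(seq_a, seq_b, window=7):
    """Return ((i, j), score) for the most strongly correlated pair of segments."""
    scores = segment_scores(seq_a, seq_b, window)
    return max(scores.items(), key=lambda kv: kv[1])
```

Two segments with no identical residues, such as AILKDE and VLIRNQ, still correlate strongly under this scheme because their hydropathy patterns match. This is the sense in which segment-level comparison can pick up similarity that residue-by-residue identity misses.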
Accomplishments
Pilot projects for protein fold prediction and comparison have been conducted
for a few complete genomes. Detailed analysis of protein fold topology
allowed us to extend a library of protein fold templates and increase
the number of predictions in complete genomes. Results of the pilot projects
revealed the folds of several hypothetical proteins in the Methanococcus
jannaschii genome, clusters of proteins with the same fold or fold
pattern in genomes of Mycoplasmas, fold population and functional/structural
relationship of yeast proteins, and structural relatedness of proteins
from other organisms.
We achieved good preliminary results in ab initio prediction of protein
structure. Statistical analysis of all known protein structures allowed
us to derive a new potential of contact energy for pairs of amino acid
residues. Using this potential and secondary structure predictions from
torsion angle dynamics, we were able to predict contacts between helices
in small globular proteins, and we built low-resolution protein structures
for 28 out of 36 small helical proteins based only on sequence information,
the best results that have been achieved by any method.
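How a contact potential can be derived from structure statistics is sketched below. This is a simple log-odds estimate under a random-mixing reference state, a common textbook approach, and not the environment-dependent potential of the publication listed for this project. The input format (a list of residue-type pairs observed in contact), the pseudocount, and the function name are assumptions made for illustration.

```python
from collections import Counter
from math import log


def contact_energies(contacts, pseudocount=1.0):
    """Estimate pairwise contact energies (in units of kT) from observed contacts.

    contacts: iterable of (res_a, res_b) residue-type pairs seen in contact.
    Energy is -ln(observed / expected), where the expected count assumes
    residue types pair at random in proportion to how often each appears
    in contacts. The pseudocount damps noise for rarely seen pairs.
    """
    pair_counts = Counter()
    type_counts = Counter()
    for a, b in contacts:
        pair_counts[tuple(sorted((a, b)))] += 1  # unordered pair
        type_counts[a] += 1
        type_counts[b] += 1

    total_pairs = sum(pair_counts.values())
    total_ends = 2 * total_pairs  # each contact has two residue "ends"
    types = sorted(type_counts)

    energies = {}
    for idx, a in enumerate(types):
        for b in types[idx:]:
            fa = type_counts[a] / total_ends
            fb = type_counts[b] / total_ends
            # Unlike pairs can form in two orders, hence the factor of 2.
            expected = total_pairs * fa * fb * (1 if a == b else 2)
            observed = pair_counts[(a, b)]
            energies[(a, b)] = -log((observed + pseudocount) / (expected + pseudocount))
    return energies
```

Pairs seen more often than random mixing predicts receive negative (favorable) energies, and pairs seen less often receive positive ones. Keys are alphabetically ordered residue pairs, for example ("K", "L").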
Significance
Structural characterization of proteins can help us understand protein function,
especially when a protein sequence shows no significant similarity
to proteins of known function. The computational aspect of structural
genomics is important because it (1) directs experimental efforts to potential
targets, (2) reduces the time for solving protein structures, and (3)
predicts fold/structure for proteins whose structure is difficult to determine
experimentally (due to low expression, solubility, etc.).
Publications
S.-H. Kim, "Structural genomics of microbes: An objective," Curr. Opin. Struct.
Biol. 10, 380 (2000).
C. Zhang and S.-H. Kim, "Environment-dependent residue contact energies for
proteins," Proc. Natl. Acad. Sci. USA 97, 2550 (2000).
C. Zhang and S.-H. Kim, "The anatomy of protein β-sheet
topology," J. Mol. Biol. 299, 1075 (2000).