Annual Report
2000
SCIENCE HIGHLIGHTS:
BIOLOGICAL and ENVIRONMENTAL RESEARCH

Computational Structural Genomics

Sensitive fold recognition methods (PCM and DuP) allow us to assign folds to more of the proteins encoded in the complete genome of the yeast S. cerevisiae.

Research Objectives
Our objective is to use computation to obtain the basic structural set for organisms with completely sequenced genomes. We identify and classify protein folds in complete genomes to find new targets for structural analysis, predict protein structures when possible, and use computational tools for solving protein structures from X-ray data.

Computational Approach
We have developed two methods, the Proximity Correlation Method (PCM) and the Dual Profile method (DuP), for detecting similarity of protein folds by comparing protein sequences. Both methods combine secondary structure predictions with correlation of physical (in PCM) or evolutionary (in DuP) properties of amino acid residues. Segments of sequences rather than single residues are compared, which substantially increases sensitivity in detecting remote homologues, i.e., proteins with similar folds but sequence similarity too low to be detected by standard sequence comparison techniques. For ab initio structural predictions, we derived new energy potentials for contacts between amino acid residues in protein structures and developed a procedure for efficient structure calculation by torsion angle dynamics.
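The idea of comparing segments by correlating a physical property, rather than matching single residues, can be illustrated with a minimal sketch. This is not the PCM implementation: the choice of the Kyte-Doolittle hydropathy scale, the Pearson correlation, the window length of 7, and the function names `segment_scores` and `best_segment_pair` are all assumptions made for illustration.

```python
import math

# Kyte-Doolittle hydropathy values, used here only as an assumed stand-in
# for the "physical properties" of residues mentioned in the text.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / math.sqrt(vx * vy)


def segment_scores(seq_a, seq_b, window=7):
    """Correlate the property profiles of every pair of segments.

    Returns a dict mapping (start_in_a, start_in_b) to the correlation
    of the two length-`window` hydropathy profiles.
    """
    pa = [KD[r] for r in seq_a]
    pb = [KD[r] for r in seq_b]
    scores = {}
    for i in range(len(pa) - window + 1):
        for j in range(len(pb) - window + 1):
            scores[(i, j)] = pearson(pa[i:i + window], pb[j:j + window])
    return scores


def best_segment_pair(seq_a, seq_b, window=7):
    """Return ((start_a, start_b), score) for the best-correlated segments."""
    scores = segment_scores(seq_a, seq_b, window)
    return max(scores.items(), key=lambda kv: kv[1])
```

In this sketch, two segments such as `LKELLKK` and `IRDIIRR` share no identical residues yet have nearly the same hydropathy pattern, so they score a high correlation even though a residue-by-residue identity count would find nothing in common.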

Accomplishments
Pilot projects for protein fold prediction and comparison have been conducted for a few complete genomes. Detailed analysis of protein fold topology allowed us to extend a library of protein fold templates and increase the number of predictions in complete genomes. Results of the pilot projects revealed the folds of several hypothetical proteins in the Methanococcus jannaschii genome, clusters of proteins with the same fold or fold pattern in genomes of Mycoplasmas, fold population and functional/structural relationship of yeast proteins, and structural relatedness of proteins from other organisms.

We achieved good preliminary results in ab initio prediction of protein structure. Statistical analysis of all known protein structures allowed us to derive a new potential of contact energy for pairs of amino acid residues. Using this potential and secondary structure predictions from torsion angle dynamics, we were able to predict contacts between helices in small globular proteins, and we built low-resolution protein structures for 28 out of 36 small helical proteins based only on sequence information — the best results that have been achieved by any method.
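To make concrete how a contact energy can be derived from statistics over known structures, the following minimal sketch uses a generic inverse-Boltzmann form, energy = -ln(observed / expected), with expected counts taken from a random-mixing reference state. This is a textbook knowledge-based potential, not the environment-dependent potential described in the publications below; the function name `contact_energies`, the pseudocount, and the reference state are assumptions.

```python
import math
from collections import Counter


def contact_energies(contacts, pseudocount=1.0):
    """Estimate pairwise contact energies (in units of kT) from counts.

    contacts: iterable of (residue_i, residue_j) one-letter pairs that were
    observed in contact in some set of structures.
    Returns a dict keyed by alphabetically sorted residue pairs.
    """
    # Contacts are unordered, so (L, K) and (K, L) are the same pair.
    pair_counts = Counter(tuple(sorted(p)) for p in contacts)
    total_pairs = sum(pair_counts.values())

    # How often each residue type takes part in any contact.
    res_counts = Counter()
    for (a, b), n in pair_counts.items():
        res_counts[a] += n
        res_counts[b] += n
    total_res = sum(res_counts.values())

    types = sorted(res_counts)
    energies = {}
    for idx, a in enumerate(types):
        for b in types[idx:]:
            fa = res_counts[a] / total_res
            fb = res_counts[b] / total_res
            # Random-mixing expectation; unlike pairs can form two ways.
            expected = total_pairs * fa * fb * (1 if a == b else 2)
            observed = pair_counts.get((a, b), 0)
            # The pseudocount (an assumption) keeps unseen pairs finite.
            ratio = (observed + pseudocount) / (expected + pseudocount)
            energies[(a, b)] = -math.log(ratio)
    return energies
```

Pairs seen in contact more often than random mixing predicts receive negative (favorable) energies, and pairs seen less often receive positive ones. A real derivation would need a large, non-redundant structure set and a carefully chosen reference state.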

Significance
Structural characterization of proteins can help in understanding protein function, especially when a protein's sequence shows no significant similarity to proteins of known function. The computational aspect of structural genomics is important because it (1) directs experimental efforts to potential targets, (2) reduces the time for solving protein structures, and (3) predicts fold/structure for proteins whose structure is difficult to determine experimentally (due to low expression, solubility, etc.).

Publications
S.-H. Kim, "Structural genomics of microbes: An objective," Curr. Opin. Struct. Biol. 10, 380 (2000).

C. Zhang and S.-H. Kim, "Environment-dependent residue contact energies for proteins," Proc. Natl. Acad. Sci. USA 97, 2550 (2000).

C. Zhang and S.-H. Kim, "The anatomy of protein β-sheet topology," J. Mol. Biol. 299, 1075 (2000).
