1998 Annual Report
Computer Science and Applied Mathematics

Center for Bioinformatics and Computational Genomics

The human genome contains about 3 billion base pairs that code for 65,000 to 75,000 genes. Mapping and sequencing the genome is an international effort involving two dozen large centers, including the Department of Energy's Joint Genome Institute. Currently, 150 million base pairs are sequenced each year. However, to sequence the entire human genome by 2005, the participating centers will need to sequence 2 million base pairs per day.
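To put these figures in perspective, the short calculation below compares the current and required sequencing rates. It is only a rough sketch: it uses the numbers quoted above, assumes sequencing runs every calendar day of the year, and treats the genome as needing a single pass, ignoring sequence already completed and any repeated coverage. The variable names are illustrative.

```python
# Figures quoted in the text (1998 estimates).
GENOME_SIZE_BP = 3_000_000_000          # about 3 billion base pairs
CURRENT_RATE_BP_PER_YEAR = 150_000_000  # current worldwide output
TARGET_RATE_BP_PER_DAY = 2_000_000      # rate needed to finish by 2005

# Assumption (not from the report): sequencing runs 365 days a year.
DAYS_PER_YEAR = 365

# Convert the current yearly output to a daily rate for comparison.
current_rate_bp_per_day = CURRENT_RATE_BP_PER_YEAR / DAYS_PER_YEAR

# How many times faster the centers would need to work.
speedup_needed = TARGET_RATE_BP_PER_DAY / current_rate_bp_per_day

# Time for one pass over the genome at each rate.
years_at_current_rate = GENOME_SIZE_BP / CURRENT_RATE_BP_PER_YEAR
years_at_target_rate = GENOME_SIZE_BP / (TARGET_RATE_BP_PER_DAY * DAYS_PER_YEAR)

print(f"Current rate: {current_rate_bp_per_day:,.0f} base pairs per day")
print(f"Speedup needed: {speedup_needed:.1f}x")
print(f"One pass at current rate: {years_at_current_rate:.0f} years")
print(f"One pass at target rate: {years_at_target_rate:.1f} years")
```

Under these assumptions, the target rate is nearly five times the current one, and a single pass over the genome would take about 20 years at the current rate versus roughly 4 years at the target rate.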

Helping biologists manage and analyze this explosion of data is the goal of NERSC's new Center for Bioinformatics and Computational Genomics (CBCG). Combining the expertise of biologists and computer scientists, CBCG strives to make the tools of both disciplines work together.

The Center's work includes creating specialized software modules, designing databases, and developing standardized methods for indexing genomic information.

Providing an accessible format for data stored at two dozen research centers is crucial to making the information useful for scientists worldwide. The web-based Genome Channel provides a prototype graphic interface using standard annotation for all genome sequences completed to date. The interface allows users to zoom in on a particular chromosome and see how much of it has been sequenced, then access each individual sequence and the accompanying annotation.

The Genome Channel was developed by scientists at Oak Ridge National Laboratory and Berkeley Lab as part of the Distributed Consortium for High-Throughput Analysis and Annotation of Genomes, a DOE Grand Challenge project.

Other projects at CBCG include:

  • the Alternative Splicing Database, which provides information about alternatively spliced genes, their products, and expression patterns
  • SubmitData, a user interface that formats data for automated submission to genome databases
  • BioPOET, a system for large-scale sequence analysis
  • the Resource for Molecular Cytogenetics, a joint project with the Cancer Genetics Program at the University of California, San Francisco
  • FoldPred, a program that predicts the protein fold classification for any amino acid sequence
  • participation in the University of California Systemwide Life Sciences Informatics Task Force.

NERSC's new Center for Bioinformatics and Computational Genomics combines the expertise of biologists and computer scientists to manage and analyze an explosion of data. Shown here are (from left, standing) David Demirjian, Inna Dubchak, Manfred Zorn, Donn Davy, Denise Wolf, Janice Mann, Sylvia Spengler, (kneeling) Igor Dralyuk, and Mischelle Merritt.

By developing partnerships, carrying out research in bioinformatics, and working with the bioinformatics community in areas such as education, training, and standards development, CBCG hopes to become the premier provider of bioinformatics expertise and services both regionally and nationwide.

