This 3D supernova simulation shows the turbulent environment
beneath the supernova shock wave.
The NERSC Program is part of the Computing Sciences organization
at Berkeley Lab and works closely with two other departments within
Computing Sciences: the High Performance Computing Research Department
and the Distributed Systems Department. These two departments conduct
a large number of independently funded research and development
efforts in applied mathematics, computer science, and computational
science. Some of their staff members also work on tasks matrixed
from the NERSC Program, such as the advanced development of scientific
computing infrastructure, and focused high-end support for NERSC
clients in areas such as algorithms, software tools, and visualization
of data.
This close association of research activities and a leading-edge
computing facility is mutually beneficial: it gives NERSC users
access to the latest technologies and tools, while encouraging developers
to address the critical needs of computational scientists. Some
of the highlights of this year's R&D efforts are described in
this section, particularly those that are relevant to SciDAC.
Applied Mathematics
Applied mathematics research at Berkeley Lab ranges from involvement
in three SciDAC projects, which are expected to yield major scientific
benefits within a few years, to investigating the randomness of
certain mathematical constants, which represents a major step toward
answering an age-old question.
APPLIED PARTIAL DIFFERENTIAL EQUATIONS
Led by Phil Colella, head of the Applied Numerical Algorithms Group
(ANAG), the Applied Partial Differential Equations (PDE) Integrated
Software Infrastructure Center (ISIC) will develop a high-performance
algorithmic and software framework for solving PDEs arising from
three important mission areas in the DOE Office of Science: magnetic
fusion, accelerator design, and combustion. This framework will
provide investigators in these areas with a new set of simulation
capabilities based on locally structured grid methods, including
adaptive meshes for problems with multiple length scales; embedded
boundary and overset grid methods for complex geometries; efficient
and accurate methods for particle and hybrid particle/mesh simulations;
and high performance implementations on distributed-memory multiprocessors.
One of the key results of this effort will be a common mathematical
and software framework for multiple applications.
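As a rough illustration of the adaptive-mesh idea mentioned above, the following Python sketch refines a one-dimensional grid only where the solution changes sharply, so that fine cells are spent on the small length scales. It is not the ISIC framework's interface: the function names `tag_cells` and `refine`, the jump-based tagging criterion, and the refinement ratio of 2 are all assumptions made for this example.

```python
def tag_cells(values, threshold):
    """Return sorted indices of cells whose jump to a neighbor exceeds threshold.

    `values` holds one solution value per cell (hypothetical data layout).
    """
    tagged = set()
    for i in range(len(values) - 1):
        if abs(values[i + 1] - values[i]) > threshold:
            # Tag both cells that straddle the steep region.
            tagged.update((i, i + 1))
    return sorted(tagged)


def refine(cells, tagged, ratio=2):
    """Split each tagged cell (lo, hi) into `ratio` equal subcells.

    Untagged cells are kept unchanged, so resolution is added only locally.
    """
    tagged = set(tagged)
    refined = []
    for i, (lo, hi) in enumerate(cells):
        if i in tagged:
            width = (hi - lo) / ratio
            for k in range(ratio):
                refined.append((lo + k * width, lo + (k + 1) * width))
        else:
            refined.append((lo, hi))
    return refined
```

For a step profile such as `[0, 0, 0, 1, 1, 1]` on six unit cells, only the two cells next to the step are tagged and split, so the grid grows from six cells to eight rather than doubling everywhere.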
Members of ANAG and the Center for Computational Sciences and Engineering
(CCSE), led by John Bell, have more than 15 years of experience
in developing adaptive mesh refinement (AMR) algorithms and software,
culminating last year in the release of Berkeley Lab AMR, a comprehensive
library of AMR software and documentation. This experience is the
foundation of their leadership role in the Applied PDE ISIC, which
includes collaborators from Lawrence Livermore National Laboratory,
the University of California at Davis, New York University, the
University of North Carolina, the University of Washington, and
the University of Wisconsin.
The research of CCSE and ANAG to date has focused primarily on
turbulent combustion processes, and their methods have matured to
the point that several of their recent simulations have accurately
reproduced experimental results. Because small-scale turbulent fluctuations
modify physical processes such as kinetics and multiphase behavior,
an important goal of their research is to develop techniques that
accurately capture the effect of small-scale fluctuations on the overall
macroscopic dynamics. They are also working on improved techniques
for visualizing AMR data (see Figure 8).
Figure 8. CCSE and the Berkeley Lab/NERSC Visualization Group collaborated
on this simulation of shock-wave physics that shows what happens
to a bubble of argon when subjected to a shock wave. This sequence
shows images from early and late simulation time steps, with
and without the underlying AMR grid. It was produced with Visapult,
our research prototype application and framework that performs
image-based-rendering-assisted volume rendering of large, 3D
and time-varying AMR datasets.
TOPS AND ACCELERATORS
Esmond Ng, leader of the Scientific Computing Group, is a collaborator
on two SciDAC projects: the Terascale Optimal PDE Solvers (TOPS)
ISIC, led by David Keyes of Old Dominion University, and the Advanced
Computing for 21st Century Accelerator Science and Technology project,
led by Kwok Ko of Stanford Linear Accelerator Center and Robert
Ryne of Berkeley Lab.
The TOPS ISIC will research, develop, and deploy an integrated
toolkit of open source, optimal complexity solvers for the nonlinear
PDEs that arise in many Office of Science application areas, including
fusion energy, accelerator design, global climate change, and reactive
chemistry. These algorithms, primarily multilevel methods, aim to
reduce computational bottlenecks by one to three orders of magnitude
on terascale computers, enabling scientific simulation on a scale
heretofore impossible.
The 21st Century Accelerator Project will develop a new generation
of accelerator simulation codes, which will help to use existing
accelerators more efficiently and will strongly impact the design,
technology, and cost of future accelerators. These simulations use
a wide variety of mathematical methods; for example, the electromagnetic
systems simulation component utilizes sparse linear solvers for
eigenmode codes.
Esmond Ng was one of the first researchers to develop and implement
efficient algorithms for sparse matrix computation on parallel computer
architectures, and some of his algorithms have been incorporated
into several scientific computing libraries. Several other Berkeley
Lab and NERSC staff members also have expertise in eigenanalysis and
sparse linear systems that the SciDAC projects will be able to draw
on. In related research during the past year, the MUMPS
general-purpose sparse solver was tuned, analyzed, and compared
with the SuperLU code developed by Sherry Li and James Demmel.
ARE THE DIGITS OF PI RANDOM?
David Bailey, NERSC's chief technologist, and his colleague Richard
Crandall, director of the Center for Advanced Computation at Reed
College, Portland, Oregon, have taken a major step toward answering
the age-old question of whether the digits of pi and other mathematical
constants are truly random. Their results were reported in the Summer
2001 issue of Experimental Mathematics.
Numbers like pi have long been thought to be "normal,"
meaning that in base 10, for example, any single digit occurs one-tenth
of the time. While the evidence to date supports this assumption,
no naturally occurring math constant (such as pi, the square
root of 2, or the natural logarithm of 2) has ever been formally
proved to be normal in any number base.
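The kind of empirical evidence referred to here can be gathered with a few lines of Python. The sketch below is only an illustration, not the method of Bailey and Crandall: it computes decimal digits of the square root of 2 with exact integer arithmetic and tabulates how often each digit appears. The function names are our own, and agreement with the one-tenth frequency over a finite sample is evidence, not proof, of normality.

```python
from collections import Counter
from math import isqrt


def sqrt2_digits(n):
    """Return the first n decimal digits of sqrt(2) after the decimal point.

    isqrt(2 * 10**(2n)) equals floor(sqrt(2) * 10**n), an integer whose
    decimal form is the leading "1" followed by the n digits we want.
    """
    scaled = isqrt(2 * 10 ** (2 * n))
    return str(scaled)[1:]


def digit_frequencies(digits):
    """Map each digit character '0'..'9' to its relative frequency."""
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in "0123456789"}


if __name__ == "__main__":
    freqs = digit_frequencies(sqrt2_digits(10_000))
    for digit, freq in freqs.items():
        print(digit, f"{freq:.4f}")
```

If the constant is normal in base 10, each printed frequency should approach 0.1 as the number of digits grows.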
Bailey and Crandall have translated this heretofore unapproachable
problem to a more tractable question in the field of chaotic processes.
They propose that the normality of certain constants is a consequence
of a plausible conjecture in the field of chaotic dynamics, which
states that sequences of a particular kind are uniformly distributed
between 0 and 1, a conjecture they refer to as "Hypothesis
A." If even one particular instance of Hypothesis A could be
established, the normality of important mathematical constants would
follow.
Computer Science
Computer science research and development at Berkeley Lab runs
the gamut from programming languages and systems software to scientific
data management, Grid middleware, and performance evaluation of
high-end systems. The expertise of our computer scientists and the
relevance of their research can be seen in the projects highlighted
below.
DOE SCIENCE GRID COLLABORATORY
Led by Bill Johnston, head of the Distributed Systems Department,
the DOE Science Grid SciDAC Collaboratory will define, integrate,
deploy, support, evaluate, refine, and develop the persistent Grid
services needed for a scalable, robust, high-performance DOE Science
Grid. It will create the underpinnings of the software environment
that the SciDAC applications need to enable innovative approaches
to scientific computing through secure remote access to online facilities,
distance collaboration, shared petabyte datasets, and large-scale
distributed computation.
The DOE Science Grid will provide uniform access to a wide range
of DOE resources, as well as standard services for security, resource
access, system monitoring, and so on. This will enable DOE scientists
and their collaborators in projects such as the Particle Physics
Data Grid (PPDG), the Extensible Computational Chemistry Environment
(ECCE), the Earth Systems Grid (ESG), and the Supernova Factory
Collaboratory to much more readily employ computational and information
resources at widely distributed institutions. It will also facilitate
development and use of collaboration tools that speed up research
and allow scientists to tackle more complex problems. All of these
services will be available through secure Web/desktop interfaces
in order to produce a highly usable environment.
Bill Johnston and the Distributed Systems Department staff have
more than a decade of R&D experience in this field, in addition
to Bill's experience as project technical manager for NASA's Information
Power Grid. NERSC Deputy Director Bill Kramer is co-principal investigator
on this SciDAC project; other collaborators are at Argonne National
Laboratory, Pacific Northwest National Laboratory, and Oak Ridge
National Laboratory. Several other projects of the Distributed Systems
Department are described below.
DEVELOPING GRID TECHNOLOGIES
The Distributed Systems Department this year formed a new Grid
Technologies Group, with Keith Jackson as group leader, to research
and develop technologies needed for the DOE Science Grid. Their
current focus is on developing high-level tools to make the Grid
easier to use and program. The group is developing Commodity Grid
Kits (CoG Kits), which allow one to utilize basic Grid services
through commodity technologies such as frameworks, environments,
and languages, to allow easier development of Grid applications.
(Some examples of these technologies are CORBA, Java, Perl, and
Python.)
The Grid Technologies Group (student intern Wesley Lau, group
leader Keith Jackson, and staff members Jason Novotny and Joshua
Boverhof) is working to make Grid middleware more user
friendly so that scientists can more easily create their own
Grid connections and working environments.
The group's preliminary work in this field includes the Grid Portal
Development Kit, which provides common components used to construct
portals allowing secure access to Grid resources via an easy-to-use
Web interface; and pyGlobus, an interface to the Globus toolkit
from Python (an interactive, object-oriented scripting language).
The group is also developing versions of the industry standard Simple
Object Access Protocol (SOAP) that use the Grid Security Infrastructure
(GSI) to provide authentication and delegation. Building on this
foundation, the group is developing a more comprehensive CoG Kit
for designing science application Web portals and problem-solving
environments.
NAVIGATING NETWORK TRAFFIC
Today's computer operating systems come configured to transfer
network data at only one speed (usually slow) regardless
of the underlying network. To take advantage of high-speed networks
like ESnet, the Net100 Project is creating software that allows
computer operating systems to tune themselves and adjust dynamically
to changing network conditions. Net100 is a collaboration of the
Pittsburgh Supercomputing Center, the National Center for Atmospheric
Research, Berkeley Lab, and Oak Ridge National Laboratory, with
Brian Tierney and the Distributed Data Intensive Computing Group
leading Berkeley Lab's effort. The network sensing components of
Net100 will be based on NetLogger and other tools developed here,
and our main contribution will be the Network Tools Analysis Framework.
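To see why a fixed configuration is "usually slow" on a fast wide-area path, consider TCP buffer sizing. The text above does not say which parameters Net100 adjusts, so treating the TCP window as the tuned quantity is an assumption here, as are the 622 Mb/s link speed, the 50 ms round-trip time, and the 64 KB default window. The standard rule of thumb is that the window should be at least the bandwidth-delay product of the path; a minimal Python sketch:

```python
def bandwidth_delay_product_bytes(bandwidth_mbps, rtt_ms):
    """Bytes that must be in flight to keep a path of this speed full.

    Both arguments are integers (megabits per second, milliseconds), so
    the arithmetic stays exact.
    """
    bits_in_flight = bandwidth_mbps * 1_000_000 * rtt_ms // 1000
    return bits_in_flight // 8


def max_throughput_mbps(window_bytes, rtt_ms):
    """Upper bound on throughput when one window is sent per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1_000_000
```

For a 622 Mb/s path with a 50 ms round trip, `bandwidth_delay_product_bytes(622, 50)` gives 3,887,500 bytes (about 3.9 MB), whereas a fixed 64 KB window limits a single connection to roughly 10.5 Mb/s on the same path. The gap between those two numbers is what dynamic tuning aims to close.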
Dealing with network traffic problems from the perspective of Grid
applications is the Self-Configuring Network Monitoring Project,
led by Brian Tierney and Deb Agarwal. For a distributed application
to fully utilize the network, it must first know the current network
properties and what is happening to its data along the entire network
path, including local and wide-area networks. Without this information,
the end-to-end system is often unable to identify and diagnose problems
within the network. This project is designing and implementing a
self-configuring monitoring system that uses special request packets
to automatically activate monitoring along the network path between
communicating endpoints. This passive monitoring system will integrate
with active monitoring efforts and provide an essential component
in a complete end-to-end network test and monitoring capability.
RELIABILITY AND SECURITY ON THE GRID
The DOE Science Grid and the availability of distributed resources
enable applications such as shared remote visualization, shared
virtual reality, and collaborative remote control of instruments.
These applications require reliable and secure distributed information
sharing and coordination capabilities, usually provided by collaboration
and security tools that use server-based systems. Unfortunately,
the need to run and support servers often prevents small collaborations
from installing the tools, while the scaling problems of server-based
systems can limit the size of large collaborations. Collaborations
are naturally built in an incremental and ad hoc manner, and this
dynamic and scalable peer-to-peer model is not supported well by
a rigid server-based structure. Two coordinated projects in the
Distributed Systems Department are addressing these problems, one
focusing on the communication issues, the other on security.
The goal of the Reliable and Secure Group Communication Project,
led by Deb Agarwal, is to develop the components necessary for a
peer-to-peer group communication infrastructure that provides reliability,
security, and fault-tolerance while enabling scalability on the
Internet scale. The InterGroup protocols are being used to provide
reliable delivery of messages, ordered delivery of messages, and
membership services, while the Group Security Layer provides the
secure group communication mechanisms. The long-term goal is to
integrate these components, being developed by Berkeley Lab's Collaboration
Technologies Group, into the DOE Science Grid infrastructure.
The Distributed Security Architectures project, led by Mary Thompson,
is working to provide assured, policy-based access control for Grid
systems and services. The foundation of this project is the Distributed
Security Research Group's Akenti certificate-based authorization
system, which provides multiple-stakeholder control over distributed
resources accessed by physically and administratively distributed
users. The Akenti access policy documents are created and maintained
by stakeholders independent of the resource server platform. Current
work focuses on integrating the Akenti authorization mechanism with
emerging standards such as the IETF's Transport Layer Security (TLS),
the Grid Security Infrastructure (GSI), WebDAV protocols, and the Generic
Authentication and Authorization interface (GAA). Integrating Akenti
with GSI is being done as part of the SciDAC National Fusion Collaboratory
proposal. A stand-alone Akenti server will also be available on
the DOE Science Grid nodes for Grid applications to use as an
option for authorization.
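The idea of multiple-stakeholder control can be illustrated with a small Python sketch. This is a conceptual model only and does not reflect Akenti's certificate formats or programming interface: the function names, the representation of policies as sets of required attributes, and the example stakeholder and attribute names are all hypothetical. The point it shows is that access is granted only when the user satisfies the conditions of every stakeholder, each of whom maintains a policy independently.

```python
def unmet_conditions(user_attributes, stakeholder_conditions):
    """Return the stakeholders whose requirements the user does not meet.

    user_attributes: set of attribute strings held by the user (hypothetical).
    stakeholder_conditions: dict mapping a stakeholder name to the set of
    attributes that stakeholder requires (hypothetical).
    """
    return sorted(
        name
        for name, required in stakeholder_conditions.items()
        if not required <= user_attributes
    )


def is_authorized(user_attributes, stakeholder_conditions):
    """Grant access only if every stakeholder's conditions are satisfied."""
    return not unmet_conditions(user_attributes, stakeholder_conditions)


if __name__ == "__main__":
    # Hypothetical policies for one shared instrument.
    policies = {
        "facility": {"safety-trained"},
        "project-lead": {"project-member"},
    }
    print(is_authorized({"safety-trained", "project-member"}, policies))
    print(unmet_conditions({"project-member"}, policies))
```

Because each stakeholder's entry is checked separately, one stakeholder can change a policy without coordinating with the others or with the resource server.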
MAKING COLLABORATIONS MORE PRODUCTIVE
Many of the tools currently available for remote collaboration
focus on rigidly structured applications such as videoconferencing.
While these are important when a high level of interaction is needed,
our experience building distributed collaboratories has revealed
a more basic need for less intrusive and more flexible ways for
people to stay in touch and work together on the daily tasks required
by large research efforts. These tasks include not only communications
and document sharing, but also tracking workflow, such as data archiving
and analysis.
The Pervasive Collaborative Computing Environment (PCCE) project,
led by Deb Agarwal and Chuck McParland, is researching, developing,
and integrating the software tools required to support a flexible,
secure, seamless collaboration environment that supports the entire
continuum of interactions between collaborators. This environment
is envisioned as a persistent space that allows participants to
locate each other; use asynchronous and synchronous messaging; share
documents, applications, progress, and results; and coordinate daily
activities.
The PCCE project is leveraging existing and recently proposed tools
such as Grid Web Services, Internet Relay Chat (IRC), Web Distributed
Authoring and Versioning (WebDAV), electronic notebooks, Basic Support
for Cooperative Work (BSCW), and videoconferencing capabilities.
By basing our environment on the DOE Science Grid computing and
data services, we hope to maximize its applicability to a wide range
of collaborative research efforts and present users with a familiar,
consistent, and secure activity management and coordination environment.
The collaborative workflow tools will also help DOE researchers
take full advantage of the flexible computing and storage resources
that will be available on the Science Grid.
MANAGING SCIENTIFIC DATA
Terascale computing and large scientific experiments produce enormous
quantities of data that require effective and efficient management,
a task that can distract scientists from focusing on their core
research. In some fields, data manipulation (getting files from
a tape archive, extracting subsets of data from the files, reformatting
data, getting data from heterogeneous distributed systems, and moving
data over the network) can take up to 80% of a researcher's
time, leaving only 20% for scientific analysis and discovery. The
goal of the SciDAC Scientific Data Management ISIC, led by Ari Shoshani,
head of the Scientific Data Management Group in the High Performance
Computing Research Department, is to reverse that ratio by making
effective scientific data management software widely available.
This ISIC will provide a coordinated framework for the unification,
development, deployment, and reuse of scientific data management
software. It will target four main areas that are essential to scientific
data management: storage and retrieval of very large datasets, access
optimization of distributed data, data mining and discovery of access
patterns, and access to distributed, heterogeneous data. The result
will be efficient, well-integrated, robust scientific data management
software modules that will provide end-to-end solutions to multiple
scientific applications.
The research and development efforts will be driven by the needs
of the initially targeted application areas: climate simulation,
computational biology, high energy and nuclear physics, and astrophysics.
Several of the teams involved in this project have developed procedures,
tools, and methods addressing scientific data management and data
mining for individual application areas. But this project is the
first attempt to unify and coordinate these efforts across all the
data management technologies relevant to the DOE mission and across
all the SciDAC scientific applications. Collaborators include Argonne
National Laboratory, Lawrence Livermore National Laboratory, Oak
Ridge National Laboratory, Georgia Institute of Technology, North
Carolina State University, Northwestern University, and the University
of California, San Diego.
Ari Shoshani and his group have been pioneers in developing a comprehensive
approach to scientific data management, and they are also collaborating
on two other SciDAC data management projects: the Particle Physics
Data Grid Collaborative Pilot, and the Earth Systems Grid II: Turning
Climate Datasets into Community Resources.
CHECKPOINT/RESTART FOR LINUX
Members of the Future Technologies Group, drawing on their previous
experience with the Linux kernel, are developing a hybrid kernel/user
implementation of checkpoint/restart. Their goal is to provide a
robust, production-quality implementation that checkpoints a wide
range of applications, without requiring changes to be made to application
code. This work focuses on checkpointing parallel applications that
communicate through Message Passing Interface (MPI), and on compatibility
with the software suite produced by the SciDAC Scalable Systems
Software ISIC.
The Scalable Systems Software ISIC, led by Al Geist of Oak Ridge
National Laboratory, is developing an integrated suite of machine-independent,
scalable systems software for effective management and utilization
of terascale computational resources. The goal is to provide open-source
solutions that work from small to large-scale systems. Berkeley
Lab's contribution, spearheaded by Paul Hargrove, includes the checkpoint/restart
implementation for Linux as well as standard interfaces between
checkpoint/restart and other components in the software suite. Paul
is also heading the process management working group within the
ISIC.
The Future Technologies Group's effort will be the first completely
open checkpoint/restart implementation designed for production supercomputing.
Other production implementations have been developed commercially,
but no information is available on how they work. Other open-source
implementations also exist, but were designed with an emphasis on
research, not on production computing. Our work will deliver not
only the benefits of checkpoint/restart to our users, but also all
of the lessons learned necessary to undertake similar efforts in
the future.
The Future Technologies Group performs research and development
on infrastructure for scientific computing; their current projects
include checkpoint/restart for Linux, Unified Parallel C, and
performance analysis of high-end systems. Members include (front)
Mike Welcome, Erich Strohmaier, Sonia Sachs, Eric Roman, Kathy
Yelick, (back) Costin Iancu, new group leader Brent Gorda, Paul
Hargrove, Jason Duell, (not shown) David Culler, James Demmel,
Lenny Oliker, Evan Welbourne, and Richard Wolski.
IMPLEMENTING UNIFIED PARALLEL C
Shared memory programming models are more attractive to many users
than the message passing programming model. The ability to read
and write remote memory with simple assignment statements is much
easier than writing code using all the conventions of a message-passing
library. However, in order to write efficient code for large-scale
parallel machines, programmers need a language that allows them
to exploit data locality on a variety of memory architectures. Unified
Parallel C (UPC) is exactly such a language.
UPC is an extension of the C programming language designed for
high-performance computing. UPC uses a Single Program Multiple Data
(SPMD) model of computation, in which the amount of parallelism
is fixed at program startup time, typically with a single thread
of execution per processor. The communication model
is based on the idea of a shared, partitioned address space, where
variables may be directly read and written by multiple processors,
but each variable is physically associated with a single processor.
The language provides a uniform programming model for shared memory
and distributed memory hardware, with some of the programmability
advantages of shared memory and the control over data layout and
performance of message passing.
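The association between shared data and processors can be modeled in a few lines of Python. The sketch below reproduces the block-cyclic layout rule UPC uses for shared arrays, in which element i of an array declared with block size B has affinity to thread (i / B) mod THREADS. The function names `affinity` and `local_indices` are our own, and a real UPC program would express the layout through a `shared [B]` declaration rather than a function call.

```python
def affinity(index, threads, block_size=1):
    """Thread that physically owns element `index` of a shared array.

    Models UPC's block-cyclic layout: blocks of `block_size` consecutive
    elements are dealt out to threads in round-robin order.
    """
    return (index // block_size) % threads


def local_indices(thread, n, threads, block_size=1):
    """Indices of an n-element shared array that are local to `thread`.

    Any thread may read or write any element, but accesses to these
    indices avoid remote communication for this thread.
    """
    return [i for i in range(n) if affinity(i, threads, block_size) == thread]
```

With four threads and the default block size of 1, thread 1 owns elements 1 and 5 of an eight-element array; with two threads and a block size of 2, it owns elements 2, 3, 6, and 7. Choosing the block size to match an algorithm's access pattern is how a programmer exploits data locality while keeping the simple shared-memory style of reads and writes.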
The goal of the Future Technologies Group's UPC effort is to build
portable, high-performance implementations of UPC for large-scale
multiprocessors, PC clusters, and clusters of shared memory multiprocessors.
There are three major components to this effort: (1) developing
a runtime layer for UPC that allows for lightweight communication
calls using the most efficient mechanism available on the underlying
hardware, (2) optimizing the UPC compiler, and (3) developing a
suite of benchmarks and applications to demonstrate the features
of the UPC language and compilers, especially targeting problems
with irregular computation and communication patterns. The project
is being led by Kathy Yelick, a joint member of the Future Technologies
Group and professor of computer science at the University of California,
Berkeley.
HIGH-END COMPUTER SYSTEM PERFORMANCE
The SciDAC Performance Evaluation Research Center (PERC), under
the leadership of NERSC's Chief Technologist, David Bailey, will
focus on how one can best execute a specific application on a given
platform. The research results from this effort are expected to
permit the generation of realistic bounds on achievable performance,
and to answer three fundamental questions: (1) why do these limits
exist; (2) how can we accelerate applications toward these limits;
and (3) how can this information drive the design of future applications
and high-performance computing systems.
PERC will develop a science for understanding the performance
of scientific applications on high-end computer systems, and engineering
strategies for improving performance on these systems. The goals
of the project are to optimize and simplify the profiling of real
applications, measurement of machine capabilities, performance prediction,
performance monitoring, and informed tuning. Studying the convoluted
interactions of application signatures and machine signatures will
provide the knowledge necessary to achieve those goals.
In addition to his own significant contributions to the field of
benchmarking and performance analysis, David will have a wealth
of experience to draw on from other Berkeley Lab and NERSC staff,
including Horst Simon, Bill Kramer, Erich Strohmaier, Adrian Wong,
Lenny Oliker, and others. Other SciDAC participants include Argonne
National Laboratory, Lawrence Livermore National Laboratory, Oak
Ridge National Laboratory, the University of Illinois, the University
of Maryland, the University of Tennessee, and the University of
California, San Diego.
Computational Science
Berkeley Lab staff work closely with scientists in a variety of
fields to develop and improve software for simulation and data analysis,
with the ultimate goal of making computational science more productive.
Some recent examples are discussed below.
BABAR DETECTS CLEAR CP VIOLATION
Simon Patton, Akbar Mokhtarani, and Igor Gaponenko upgraded the database
for the BaBar detector, helping researchers sort through millions
of subatomic events to find clues to the asymmetry of matter
and antimatter.
Why is there more matter than antimatter in the Universe? One plausible
explanation is CP violation occurring in the first seconds after
the Big Bang. CP violation means violation of the combined conservation
laws associated with charge conjugation (C) and parity (P) by the
weak nuclear force, which is responsible for reactions such as the
decay of atomic nuclei. The existence of CP violation was experimentally
demonstrated decades ago, but there are conflicting theories to
explain it.
The Asymmetric B Factory and BaBar detector at the Stanford Linear
Accelerator Center were built to provide new data to help solve
the matter/antimatter puzzle. On July 6, 2001, after analyzing data
from 32 million pairs of B mesons, the international BaBar Collaboration
announced that BaBar had found 640 pairs that exhibited unmistakable
differences in the ways that the matter and antimatter B mesons
decayed: clear evidence of CP violation in agreement with the
Kobayashi-Maskawa model, one of the two leading theories.
Part of the software infrastructure for BaBar data analysis was
recently upgraded by the HENP Computing Group, which develops software
for large, international high energy and nuclear physics experiments.
Specifically, Simon Patton, Akbar Mokhtarani, and Igor Gaponenko
completed a major upgrade of the BaBar database that allows data
processing to scale to multiple petabytes of data, with further
improvements feasible in the future. They also discovered ways of
storing data more efficiently and improved the parallel accessibility
and reliability of the database. These upgrades will help accommodate
larger datasets resulting from improved accelerator luminosity and
changing physics goals.
QUANTUM RODS EMIT POLARIZED LIGHT
A collaboration between experimental and computational scientists
at the University of California and Berkeley Lab has made a significant
discovery in nanoscience. In the June 15, 2001 issue of Science,
the research team reported that colloidal quantum rods of cadmium
selenide (CdSe) exhibit linearly polarized emission, which may make
them useful as light emitters in a wide range of nanotechnology
applications, such as biological labeling, flat panel displays,
and lasers.
The article "Linearly Polarized Emission from Colloidal Semiconductor
Quantum Rods" (Science 292, 2060) was written by Jiangtao
Hu, Liang-shi Li, Weidong Yang, Libero Manna, and A. Paul Alivisatos
(all of the Berkeley Lab Materials Science Division and the UC Berkeley
Chemistry Department), and Lin-wang Wang of the Scientific Computing
Group. The computation was done on NERSC's Cray T3E with the Escan
code developed by Lin-wang, which can calculate million-atom systems
using the folded spectrum method for non-self-consistent nanoscale
calculations. The calculation showed that the photoluminescence
of the CdSe quantum dot changes from non-polarized to
linearly polarized as the shape changes from spherical to rod-like,
at an aspect ratio of 2. This result was confirmed by experimental
measurements.
This discovery showed that optical emission properties of quantum
dots can be tailored by adjusting the height, width, and shape of
the potential that confines electrons and holes. The technological
significance is that colloidal rods can be produced by comparatively
simple solution methods and are photochemically robust, making them
good candidates for a variety of light emission applications.
NEW PARALLEL ELECTRONIC STRUCTURE CODE
Andrew Canning of the Scientific Computing Group gave a presentation
on a new parallel electronic structure code, P-FLAPW, at the International
Conference for Computational Physics in September 2001. Andrew developed
the parallel code in collaboration with Wolfgang Mannstadt of Marburg
University and Arthur Freeman's group at Northwestern University.
FLAPW (for full-potential linearized augmented plane-wave) is one
of the most accurate and widely used methods for determining structural,
electronic, and magnetic properties of crystals and surfaces. Until
the work by this group, the method was limited in scope because
it did not have a parallelized version, so it could only be applied
to small systems. Now with the parallel code P-FLAPW, it is possible
to perform calculations on systems of hundreds of atoms, which means
technologically important systems such as nanostructures, impurities,
and disordered systems can be studied with this highly accurate first-principles
method. Use of the parallel eigensolvers from the ScaLAPACK library
allows the P-FLAPW code to scale up efficiently to hundreds of processors,
which is a computational requirement for the study of large systems.
ScaLAPACK is one of the many computational tools that form the Department
of Energy's ACTS Toolkit (see below).
NEUTRINO DATA FROM THE SOUTH POLE
Jodi Lamoureux of the Scientific Computing Group took a business
trip to the South Pole this past year as part of her work collaborating
on the software infrastructure for the AMANDA project (Antarctic
Muon and Neutrino Detector Array). AMANDA is a neutrino observatory
that searches for high-energy neutrinos from cosmic sources to verify
that active galactic nuclei and gamma-ray bursters are proton accelerators.
Jodi's usual routine at home includes analyzing AMANDA data with
algorithms and visualization tools that she helped develop for data
filtering and reconstruction, but she flew to Antarctica during
the local summer to work on AMANDA data handling. In addition to
taking detector calibration measurements, she also helped with satellite
transfers and organized various processes that select data samples
for monitoring and quick analysis.
Initial results validating the AMANDA technology, computed at NERSC,
were published in the March 22, 2001 issue of Nature as "Observation
of High-Energy Neutrinos Using Cerenkov Detectors Embedded Deep
in Antarctic Ice" by E. Andrés et al. (Nature
410, 441).
IMPROVING CLIMATE MODEL PERFORMANCE
A multi-institutional team has been collaborating to merge two
of the world's most advanced computer climate models, the Climate
System Model (CSM) and the Parallel Climate Model (PCM). The merged
Community Climate System Model (CCSM) is being designed to include
the best features of both models and to perform well on a variety
of computer architectures. Chris Ding, Helen Yun He, and Woo-Sun
Yang of the Scientific Computing Group have been working to optimize
parallel input/output and to optimize performance of the coupler
(the top-level model that integrates the component models) on distributed
memory architectures.
A major contribution of Chris's team this year was development
of the Multi Program-Components Handshaking Utility (MPH). Many
large and complex scientific applications are based on semi-independent
program components developed by different groups or for different
purposes (in this case, CSM and PCM). MPH handles the initial component
handshaking and registration process necessary for combining codes
on a distributed memory architecture. MPH supports two software
integration mechanisms (multi-component multi-executable, and
multi-component single-executable, with processor overlapping or
non-overlapping) as well as a modular approach in which each
component builds its own executable. With this utility, one can
change execution modes relatively easily without extensive rewriting
of codes. Although developed for CCSM, this flexible component coupling
system could be used by a wide range of applications.
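The registration step described above can be pictured with a small sketch. This is not MPH's actual interface; the names `Component`, `register`, and `components_on_rank` are hypothetical and chosen only for illustration. It assumes each component asks for a contiguous, inclusive range of processor ranks, and it shows the difference between the overlapping and non-overlapping processor layouts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One model component and the inclusive rank range it requests.

    Hypothetical type for illustration; not part of MPH.
    """
    name: str
    first_rank: int
    last_rank: int


def register(components, world_size, allow_overlap=False):
    """Build a table mapping each component name to its list of ranks.

    Raises ValueError if a range falls outside 0..world_size-1 or, when
    allow_overlap is False, if two components claim the same rank.
    """
    table = {}
    owner = {}  # rank -> name of the component that claimed it first
    for comp in components:
        if not (0 <= comp.first_rank <= comp.last_rank < world_size):
            raise ValueError(f"{comp.name}: ranks outside 0..{world_size - 1}")
        ranks = list(range(comp.first_rank, comp.last_rank + 1))
        if not allow_overlap:
            for rank in ranks:
                if rank in owner:
                    raise ValueError(
                        f"rank {rank} claimed by both {owner[rank]} and {comp.name}"
                    )
                owner[rank] = comp.name
        table[comp.name] = ranks
    return table


def components_on_rank(table, rank):
    """Return the names of all components that run on the given rank."""
    return [name for name, ranks in table.items() if rank in ranks]


if __name__ == "__main__":
    # Assumed example layout: two components on disjoint processors.
    layout = [Component("atmosphere", 0, 3), Component("ocean", 4, 7)]
    table = register(layout, world_size=8)
    for rank in range(8):
        print(rank, components_on_rank(table, rank))
```

In a real distributed-memory code, a table like this would typically be used to split the global set of processes into one communicator per component; the sketch stops at the bookkeeping.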
The climate team also has optimized the three most time-consuming
subroutines in the PCM coupler: the flux conservation, the ocean-to-atmosphere
regridding, and the atmosphere-to-ocean regridding. These optimizations
improved the total coupler timing by 20% on 64 processors, and they
have been adopted in production codes. Other team activities have
included assessing various I/O and file systems, studying methods
to increase climate simulation reproducibility, and improving the
performance of finite difference methods. All of these efforts will
contribute to the SciDAC project "Collaborative Design and
Development of the Community Climate System Model for Terascale
Computers."
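To make concrete what regridding with flux conservation involves, here is a minimal one-dimensional sketch. It is not the PCM coupler's algorithm; it assumes piecewise-constant cell values, two grids that cover the same interval, and simple overlap weighting, and the function names are hypothetical. The point is only that each destination cell receives the overlap-weighted average of the source cells, so the integrated quantity is the same on both grids.

```python
def overlap(a_lo, a_hi, b_lo, b_hi):
    """Length of the intersection of intervals [a_lo, a_hi] and [b_lo, b_hi]."""
    return max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))


def conservative_regrid(src_edges, src_values, dst_edges):
    """Remap cell averages from a source grid to a destination grid.

    src_edges and dst_edges are increasing lists of cell boundaries;
    src_values holds one average per source cell.
    """
    if len(src_edges) != len(src_values) + 1:
        raise ValueError("need exactly one more edge than source values")
    dst_values = []
    for j in range(len(dst_edges) - 1):
        d_lo, d_hi = dst_edges[j], dst_edges[j + 1]
        total = 0.0
        for i, value in enumerate(src_values):
            # Weight each source cell by how much of it lies in this cell.
            total += value * overlap(src_edges[i], src_edges[i + 1], d_lo, d_hi)
        dst_values.append(total / (d_hi - d_lo))
    return dst_values


def integral(edges, values):
    """Integrated quantity on a grid: sum of cell average times cell width."""
    return sum(v * (edges[i + 1] - edges[i]) for i, v in enumerate(values))


if __name__ == "__main__":
    # Assumed toy grids: a fine "ocean" grid and a coarser "atmosphere" grid.
    ocean_edges = [0.0, 1.0, 2.0, 3.0, 4.0]
    ocean_flux = [1.0, 2.0, 3.0, 4.0]
    atmos_edges = [0.0, 1.5, 4.0]
    atmos_flux = conservative_regrid(ocean_edges, ocean_flux, atmos_edges)
    print(atmos_flux)
    print(integral(ocean_edges, ocean_flux), integral(atmos_edges, atmos_flux))
```

The double loop costs time proportional to the product of the two grid sizes, which is fine for a sketch; production regridding on two-dimensional grids normally precomputes the overlap weights once and reuses them.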
ACTS TOOLKIT EXPLICATED
The ACTS Toolkit is a set of DOE-developed tools that make it easier
to write parallel scientific programs. The ACTS Online Information
and Support Center (http://acts.nersc.gov/),
operated by Osni Marques and Tony Drummond of the Scientific Computing
Group, is a centralized source of information about these tools.
But not content to sit back and respond to inquiries, Tony and Osni
have taken a proactive role in promoting the ACTS tools.
In October 2001 they organized a three-and-a-half-day workshop
at Berkeley Lab, "Tools for Advanced Computational Testing
and Simulation: Solving Problems in Science and Engineering,"
aimed at familiarizing researchers in various scientific disciplines
with the ACTS tools. The workshop included a range of tutorials
on the tools, discussion sessions focused on solving specific computational
needs of the participants, and hands-on practice using NERSC's computers.
More than 50 presenters and participants took part in the workshop.
As part of the Los Alamos Computer Science Institute's Second Annual
Symposium, also held in October, Osni and Tony organized a full-day
workshop on "High-Performance Numerical Libraries for Science
and Engineering," with Sherry Li also giving a presentation.
Topics included an introduction to the tools, a panel discussion on the
tools and applications, tool interoperability, a panel discussion
on frameworks and standards for software interoperability, scientific
and engineering applications, and a panel discussion on the usage and
applicability of commercial and noncommercial software.
VISUALIZATION GROUP TACKLES AMR DATA
[Photo: John Shalf, new members Ken Schwartz and Cristina Siegerist, and
group leader Wes Bethel make up the core of the Berkeley Lab/NERSC
Visualization Group.]
During the past year, Wes Bethel accepted the position of Group
Leader for the Berkeley Lab/NERSC Visualization Group, whose mission
is to apply scientific visualization principles and practices to
scientific data in a multidisciplinary setting, and to anticipate,
define, and develop new visualization technologies that are appropriate
for contemporary and future applications. To meet the needs of production
requirements, the Visualization Group installs and maintains a portfolio
of visualization software on NERSC platforms. To meet the evolving
needs of remote users, the Visualization Group has defined a roadmap
for expanding the breadth and depth of services to the remote user
constituency.
An ongoing focus of the group's research is Visapult, an application
and framework for remote and distributed visualization. Visapult
uses parallel computers, a desktop workstation, and a remote data
source that are coupled together into a distributed application
that implements image-based rendering assisted volume rendering.
This application has a unique feature of effectively decoupling
interactivity on the desktop from the delays inherent in network-based
applications.
The Visualization Group has broadened its scope of research activities
to include faculty and staff from the University of California at
Davis's Center for Image Processing and Integrated Computing (CIPIC).
Together, the two groups have focused on methods for direct volume
rendering of adaptive mesh refinement (AMR) data (see figure 8 above).
AMR data visualization poses special challenges, particularly when
the datasets are large and network connections to remote locations
are slow. The group plans to begin to deploy these research prototypes,
as well as Visapult, into a limited production environment using
Web-based portal technology. The portal technology will serve to
simplify user access to the remote and distributed visualization
software components.