
Summary of ERSUG Meeting

January 12 - 13, 1995, Richland, Washington

The latest Energy Research Supercomputer Users Group (ERSUG) meeting was held at Pacific Northwest Laboratory in Richland, Washington, on January 12 - 13, 1995. Some of the talks are summarized below.

The View from Washington (Tom Kitchens)

Tom Kitchens reminded ERSUG that since the November election, the Congressional committees responsible for U.S. science have new chairs and members. The mood, even among those who believe science and technology are important to U.S. economic life, is to cut Federal spending sharply. The Energy Research (ER) budget for FY1996 is expected to be reduced by 15-20% on average.

Tom discussed the formation of the Distributed Computing Coordination Committee (DCCC) under the auspices of the ESnet Steering Committee. He stressed the importance of this issue, forecasting that within a few years most of the cycles used by ER principal investigators would be delivered in a distributed computing environment. The DCCC effort needs input and guidance from ER users, and ERSUG is the only organization that represents them. He warned that if ERSUG was not proactive here, its users would have to accept whatever others decided to provide for them.

Unified Production Environment (Mike McCoy)

The Unified Production Environment (UPE), which we seek to complete by late 1996, will focus on organizing traditional services into an integrated unit. Just as a WWW browser presents an intuitive interface through which a user can display information independent of its location, NERSC will present a service interface through which NERSC users request computing services independent of which machines provide the services. After logging in, the researcher sees a shared file system regardless of which service is used. At a later stage of the development of the unified environment, the user will have the tools required to use the resources through distributed batch and interactive computing. We view this as a unification of centralized services.

Ultimately, NERSC seeks to integrate the remote user's local environment with the centralized NERSC environment. This could be viewed as a unification of distributed services. An emphasis on service unification is essential because individual components of the service structure--in particular the soon-to-arrive, high-end massively parallel processing (MPP) computational platforms--will require this environment in order to be used to their fullest potential.

In determining its major goals, NERSC must take into account two variables: the first is the technological change occurring in the world, and the second is the users' reactions to this change. The dominant components of technological change come from (1) the microprocessor revolution, (2) the adoption of UNIX and other standardized protocols (such as X) worldwide, (3) the exponential increase in wide area network (WAN) and local area network (LAN) bandwidths, and (4) the huge gains in tertiary storage capabilities.

The impact of the microprocessor has already been felt at NERSC. Used as a high-end workstation in the form of the Supercomputing Auxiliary Service (SAS), it helped the Center to regain some of the functionality lost through the adoption of UNIX on the supercomputer. SAS provides a rich set of tools and pre- and post-processing capabilities. This union between microprocessor and supercomputer represents an initial step toward a multicomponent UPE.

Of even greater importance to NERSC is that some of the offspring of the microprocessor have largely surpassed the vector supercomputer in capability. These have evolved into three distinct species: the MPP, the symmetric multiprocessor (SMP), and the workstation (or PC) cluster built around a high-performance asynchronous transfer mode (ATM) interconnect. The last is an interesting, but still unproven, technology in the research stage.

Four Basic Goals of the UPE

We have identified four primary goals of the UPE:

 

  • NERSC must continue to serve the needs of high-end computing. A centrally located MPP within a UPE offering adequate support infrastructure remains a central goal. Regardless of the improvement in desktop or cluster technology, no local user environment, no matter how rich, can hope to offer peak computational speeds on the order of 300 to 600 gigaflops or storage capabilities on the order of 200 terabytes by 1996.

     

  • NERSC must continue to offer the additional services required by the high-end user with limited local resources, including the following: (1) code development, computing, and assimilation capabilities, (2) archival storage, and (3) information services, consulting, and in-depth collaborations. Information services, consulting, and collaborative ventures with NERSC staff are becoming more important because of the complexity of the new programming and assimilation environments. Without this expertise, the capability latent in the hardware will not be realized. An SMP will be installed as part of the development and assimilation environment.

     

  • NERSC must facilitate the users' progression through all phases of computational research projects by carefully integrating the services described above. We must make the revolutionary changes users face feel evolutionary. The MPP is not currently an optimal platform for code development, debugging, and short development runs. Ancillary services must be easily accessible and must be offered on the appropriate platform.

     

  • NERSC must integrate the remote user's local environment with the centralized NERSC services. This unification of distributed services forms the other part of the UPE. For users with a rich local environment, the Center must act as the high-end complement to the local environment. For users with a very basic environment, the Center must seek to provide all the services necessary to compute effectively at the high end.

While the basic mission of NERSC remains essentially unchanged, the new technologies demand that the services offered evolve to become more tightly coupled, sophisticated, and global. The UPE will coordinate the services seen by the user across the NERSC infrastructure. In the later stages of its implementation, the UPE will extend beyond the centralized environment as we seek to unify the user-local environment with the NERSC centralized environment.

System Administration (Moe Jette)

From a systems administration standpoint, the goal of the UPE is to provide our clients with the ability to exploit the diverse computational and storage resources of NERSC with the ease of a single computer system. The key to this goal is the integration of services, not only within a single computer, but spanning all of NERSC's computers and extending to the desktop environment of our customers.

Some aspects of this environment are described below to give users a sense of how the UPE will affect their work (a brief sketch of the intended workflow follows the list):

 

  • The burden of logging into individual computers will be replaced with a single Kerberos-based authentication to all NERSC services.

     

  • Explicit file transfers will be largely replaced by a global file system incorporating the Andrew File System (AFS), Distributed File Service (DFS), and NERSC's archive.

     

  • Performing work on an individual computer will be replaced by global batch and interactive computing based upon the Network Queuing Environment (NQE), Portable Batch System (PBS), and Load Sharing Facility (LSF). This distributed computing will match the proper resource to the task: Most interactive processing will be off-loaded from the supercomputers, and large batch jobs will automatically be scheduled on the least heavily loaded supercomputer.

     

  • Licensed software with floating licenses will be available for use on the various NERSC computers and at our customers' sites.
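To make the intent of these changes concrete, here is a minimal Python sketch of the workflow the UPE aims to provide: authenticate once with Kerberos, then hand work to the global batch system, which places it on the least heavily loaded machine. The wrapper functions, host names, and load figures below are hypothetical illustrations, not actual NERSC tools; only kinit is a real, standard Kerberos client.

    #!/usr/bin/env python3
    """Hypothetical sketch of the UPE workflow: one authentication, then
    job placement handled by the environment (in reality by NQE, PBS, or
    LSF) rather than by the user logging into a particular machine."""

    import subprocess

    def authenticate(principal):
        # A single Kerberos ticket replaces separate logins to each computer.
        # kinit is the standard Kerberos client; it prompts for a password.
        subprocess.run(["kinit", principal], check=True)

    def least_loaded(loads):
        # In the real UPE this decision belongs to the batch system; here we
        # simply pick the host with the lowest reported load for illustration.
        return min(loads, key=loads.get)

    def submit(script, host):
        # Stand-in for handing the job to the global batch system.
        print("would submit %s on %s via the global batch system" % (script, host))

    if __name__ == "__main__":
        authenticate("user@NERSC.GOV")            # hypothetical principal
        loads = {"c90": 0.92, "sas": 0.35}        # hypothetical load figures
        submit("run_model.sh", least_loaded(loads))

Under the UPE, the same single sign-on would also cover the global file system, so the job script and its data would be visible from whichever host ends up running the work.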

SMPs--Where Do They Fit? What Do They Do? (Brent Gorda)

NERSC has undertaken a study of SMPs in response to the recent surge in their popularity. We hope to understand how these systems complement our current and future hardware environment. The raw processing power of SMPs is considerable and can be used to tackle problems that do not make good use of the CRAY Y-MP C90 today. For example, running scalar codes, pre- and post-processing of applications, and visualization tasks are uses that seem well suited to SMPs.

SMPs also have an interesting architecture that may lend itself very well to the UPE envisioned at NERSC. The hardware architecture is close enough to that of the CRAYs to enable running and debugging micro/macrotasking applications. The basic microprocessor and node architecture, and the operating system, are in many ways similar to those of MPPs. Thus, writing, porting, and debugging parallel applications on an SMP may be easier than on the current MPPs, given the MPPs' limited ability to timeshare.

At the ERSUG meeting we heard strong support for the NERSC study, and we are currently running one SMP machine for it. Many NERSC users are active in this process, and to date we can report that people are pleased with the machine's scalar performance and with its interactivity even under load. Early indications are good, and we hope to report more in the near future.

Preparation for the Massively Parallel System (Tammy Welcome)

NERSC's first massively parallel (MP) computer system is expected to arrive in the latter part of 1995. NERSC is providing several ways for users to prepare for MP computing.

The NERSC staff will continue to collaborate with research scientists to develop parallel versions of serial applications. Activities include analyzing existing serial applications and converting them to parallel form, developing new algorithms better suited to parallelization, tuning parallel applications to minimize communication overhead and maximize single-processor performance, developing distributed applications that permit interactive program control and/or real-time visualization of data from a parallel program, facilitating the sharing of ideas among research groups working on parallel code development, and developing tools to help users in the transition to parallel computing. Research scientists interested in collaborating with NERSC staff on the development of a parallel application, or in need of tools to help in the conversion effort, should send e-mail to consult@nersc.gov.
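As an illustration of the conversion pattern described above (divide the domain among processors, compute locally, and combine results with as little communication as possible), here is a small sketch using mpi4py purely as a modern stand-in; the codes discussed in 1995 would have used PVM or early MPI from Fortran or C, and the integrand and problem size below are made up for the example.

    # Toy serial-to-parallel conversion: trapezoid-rule integration of x^2
    # on [0, 1].  Each rank integrates its own subdomain; one collective
    # reduction is the only communication in the run.
    # Run with, e.g.:  mpiexec -n 4 python trapezoid.py
    from mpi4py import MPI

    def f(x):
        return x * x                      # integrand for this toy example

    def local_trapezoid(a, b, n):
        # Serial kernel: trapezoid rule on [a, b] with n intervals.
        h = (b - a) / n
        s = 0.5 * (f(a) + f(b))
        for i in range(1, n):
            s += f(a + i * h)
        return s * h

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    A, B, N = 0.0, 1.0, 1000000           # global problem definition
    width = (B - A) / size                # each rank owns one subdomain
    a_local = A + rank * width
    partial = local_trapezoid(a_local, a_local + width, N // size)

    # One reduction combines the per-rank partial sums on rank 0.
    total = comm.reduce(partial, op=MPI.SUM, root=0)
    if rank == 0:
        print("integral of x^2 on [0,1] ~= %.6f" % total)   # exact value is 1/3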

Another way that research scientists can get started is via the MPP Access Program. During Round 1 of the program, nine proposals were awarded allocations on four parallel platforms, including the T3D, CM-5, Paragon, and KSR1. During Round 2 of the program, 21 proposals were awarded allocations on the T3D. Details about the projects awarded time in Round 2 will appear in a future Buffer article. The next round of proposals will be due in the latter part of August for allocations starting October 1, 1995.

Finally, on June 14 - 30, NERSC will host the Summer Workshop on Massively Parallel Processing. The first three days of the workshop will include general tutorials and invited lectures on parallel computing. The second week will focus on advanced topics and hands-on experience with an MP computer system. During the third week, NERSC will provide computing facilities and consulting support for participants to work on their own parallel scientific computing projects.

Follow-up on Throughput on the CRAY Y-MP C90 (Bruce Griffing)

During the previous ERSUG meeting, some sites voiced concern that their NQS (Network Queuing System) jobs were not progressing through NQS in a timely manner. NERSC agreed to analyze Pacific Northwest Laboratory's (PNL's) throughput and report back to ERSUG. A team consisting of Bruce Curtis, Bruce Griffing, Moe Jette, Bruce Kelly, Alan Riddle, and Clark Streeter was formed to do the analysis.

The team identified performance metrics and gathered the data needed for the analysis. The key values were velocity (CPU time divided by wall-clock time once the job begins execution), wait time (the time between submission and initial execution), and held time (the time between when a job is checkpointed because its allocation is depleted and when a new allocation is infused; held time directly affects velocity). After spending several months collecting data, we learned that it is extremely difficult to reconstruct the C90 environment at any given moment from the information logged by UNICOS and NQS. We went through many iterations refining NQSTAT because of the variations in the data we encountered. The result of that process, however, was a much-improved tool that NERSC customers can run.
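For readers unfamiliar with these metrics, here is a small worked sketch in Python; the record fields and numbers are hypothetical illustrations, not the actual UNICOS or NQS accounting format.

    # Worked example of the throughput metrics on a made-up job record.
    class JobRecord:
        def __init__(self, submitted, started, finished, cpu_seconds, held_seconds):
            self.submitted = submitted        # time the job entered NQS (s)
            self.started = started            # time the job first began executing (s)
            self.finished = finished          # time the job completed (s)
            self.cpu_seconds = cpu_seconds    # CPU time consumed (s)
            self.held_seconds = held_seconds  # time checkpointed awaiting a new allocation (s)

    def wait_time(j):
        # Time between submission and initial execution.
        return j.started - j.submitted

    def velocity(j):
        # CPU time / wall-clock time once the job begins execution.
        # Held time stretches the wall clock, which is why it lowers velocity.
        return j.cpu_seconds / (j.finished - j.started)

    job = JobRecord(submitted=0, started=3600, finished=40000,
                    cpu_seconds=18000, held_seconds=7200)
    print("wait     = %d s" % wait_time(job))    # 3600 s in the queue
    print("velocity = %.2f" % velocity(job))     # 18000 / 36400 ~= 0.49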

Among the observations and conclusions presented at the January ERSUG meeting were that the NQS data are very noisy and that post-analysis is extremely labor intensive. NQS was performing consistently with how it had been tuned, but for the class of large jobs, PNL's velocities were in fact lower than those of comparable jobs. At that time our analysis was not complete. Since then we have learned that the problems seem to be confined to the Gaussian and Crystal codes. Bruce Curtis was able to make some optimizations that resulted in significant improvements for the specific test cases we had, and this information was communicated to PNL.

Storage (Steve Louis)

Storage is a key element in NERSC's UPE. The File Storage Systems Group helps provide solutions for the use of new storage hardware and storage services that cannot be economically duplicated at local sites (e.g., multi-petabyte capacity, data rates of hundreds of megabytes per second, and continuous 24-hours-a-day operation). NERSC's strategic storage goals include high-quality service, scalable facilities, support for heterogeneous client environments, support for large data management systems, and flexible administrative policies for charging, quotas, and load balancing. Viewing archival storage as part of a user's local or shared file system is also a key UPE goal.

A recent milestone is the acquisition of a new Base Storage System modeled after proven National Storage Laboratory (NSL) technologies. This system includes a high-performance parallel interface (HiPPI)- and fiber distributed data interface (FDDI)-connected high-performance RS/6000 server with a 100-GB SCSI-2 fast/wide disk cache, a high-capacity IBM 3494 robotic system with support for new 3590 high-performance tape, and a commercial version of NSL-UniTree storage software. The NERSC Base Storage System is expected to be operational in early May 1995 and is similar to the storage system recently installed at Livermore Computing's Facility for Advanced Scalable Computing Technology (FAST).

Planned FY95 upgrades for the Base System include a new 95-GB HiPPI disk array and new high-capacity, high-speed tape drives (10-25 GB per cartridge uncompressed, and up to 20-MB-per-second data transfers for compressible data). Plans for acquiring a large Fully Configured Storage System (FCSS) in FY96 are also under way. The FCSS is planned as a large upgrade or augmentation to the Base System and is expected to completely replace the current CFS (Common File System) and IBM 4381 storage environment. The FCSS will include new disk and archival subsystems, and run a commercial high-performance, scalable storage management software product such as the NSL's High-Performance Storage System (HPSS).

User Services and Information Systems (Jean Shuler)

NERSC has a reputation for providing excellent service to our customers, built on many years of experience in listening and responding to their needs. The User Services Group's traditional role has been to provide technical consulting services and to act as a user advocate. We collaborate with NERSC scientists and researchers, develop and provide technical training, and perform software quality assurance. A primary responsibility is to develop and provide current, accurate information through browsing and searching software.

These services will continue to be important, but new focus areas have been introduced to keep NERSC on the leading edge of technology. To address the continual changes of this period of information and technological revolution, we are working to provide a single interface to all information delivery systems, developing training techniques that use new media, and providing technical expertise for collaboration and coordination with NERSC scientists and researchers.

The single-user-interface information delivery system is based on the Standard Generalized Markup Language (SGML) standard. The goal of this system is to deliver information anywhere, at any time, using the appropriate presentation medium. A WWW interface is the immediate vehicle by which this information will be integrated and presented. The information and delivery systems being integrated across diverse hardware and software platforms include CrayDoc, man pages, the NERSC documentation database, the REMEDY trouble ticket system database, bulletin boards, vendor help packages, and other WWW databases. We plan to present training classes using video-on-demand technology, video teleconferencing, and asynchronous access to the classes. Asynchronous access to stored classes can be provided either by file transfer or by real-time playback over a connection to one of the prototype high-speed networks in the San Francisco Bay Area, via an integrated services digital network (ISDN) or LAN connection to the Internet, or via a modem connection over standard telephone lines.

Providing technical expertise for collaboration and coordination with NERSC scientists and researchers is critical. The User Services Group and other NERSC staff are working with researchers in such areas as parallel programming, code optimization, distributed computing, and visualization of scientific data. These efforts benefit both NERSC and the ER programs. We will continue to provide our traditional services and to build new ones to satisfy customer needs and to promote their science.

Distributed Computing (Roy Whitney)

Roy Whitney presented information on the Distributed Computing Coordinating Committee (DCCC) and ESnet Site Coordinating Committee (ESCC) mission and goals. He discussed interactions of the DCCC and ESCC with other ESnet community groups such as ERSUG. The initiation of the Distributed Informatics, Computing, and Collaborative Environment (DICCE) project was also discussed. Here is an overview of the latest task forces and working groups comprising the DCCC, ESCC, and DICCE project efforts.

ESCC Task Forces and Working Groups

E-mail Task Force (EMTF)--Chair: Dave Osterman, osterman1@llnl.gov. Develops strategies for integrating WAN e-mail with LAN e-mail systems. The goal is to create an "interoperable" e-mail system, including the ability to send attachments or enclosures. EMTF is also developing strategies for deploying secure or privacy-enhanced e-mail, using technologies such as Pretty Good Privacy (PGP) and Privacy Enhanced Mail (PEM).

ESnet Decnet Working Group (EDWG)--Chair: Phil DeMar, demar@fnal.gov. Coordinates DECnet functionality in the ESnet community in support of ESnet operations. Negotiates DECnet issues with the global DECnet community. EDWG is responsible for the migration of DECnet Phase IV to DECnet Phase V.

IPng Working Group--Chair: Bob Fink, rlfink@lbl.gov. Develops strategies and pilots for the implementation of the next generation of IP networking both for local sites and in concert with ESnet Management for the ESnet community.

Network Monitoring Task Force (NMTF)--Chair: Les Cottrell, cottrell@slac.stanford.edu. Acts as a focus group/forum for ESnet community sites in the area of network monitoring. Shares network monitoring information among participants (plans, experiences with hardware/tools/applications, requirements, threshold metrics, and performance objectives).

Local Asynchronous Transfer Mode Task Force (LATM)--Chair: Bob McMahon, mcmahon@anl.gov. Has successfully run a pilot project on implementing ATM in the LAN environment. Is now coordinating LAN ATM information within the ESnet community as ATM is further deployed.

Remote Conferencing Working Group (RCWG)--Chair: Kipp Kippenhan, kippenhan@fnal.gov. Advances collaborative video conferencing for both conference rooms and the desktop. Both ISDN and Internet multicast backbone (Mbone) style technologies are used. Furthermore, they are being developed to be interoperable. The RCWG works with the ESnet Video Conferencing Service at NERSC to coordinate video activities in the ESnet community.

Joint ESCC and DCCC Task Forces

Authentication Task Force (AUTHTF)--Chair: Doug Engert, deengert@anl.gov. AUTHTF is setting up Kerberos V5 authentication throughout the ESnet community. In particular, AUTHTF is identifying, understanding, and resolving the technical, procedural, and policy issues surrounding peer-to-peer authentication in an inter-organization Internet. It is coordinating this activity with the Key Distribution Task Force. As the Open Software Foundation's Distributed Computing Environment (OSF/DCE) authentication services become functional, the AUTHTF will work with the Distributed Computing Environment Working Group to implement authentication in these environments.

Key Distribution Task Force (KDTF)--Chair: Bill Johnston, johnston@george.lbl.gov. Coordinates issues related to the deployment of secure keys to be used by e-mail technologies such as PEM and PGP, and authentication services such as digital signature. If appropriate, recommends strategies for such deployment. The KDTF is charged to ensure that their efforts are compatible with those of the Internet Engineering Task Force (IETF) and Federal interagency key distribution task forces.

DCCC Task Forces and Working Groups

Andrew File System/Distributed File System Task Force (ADFSTF)--Chair: Troy Thompson, tk_thompson@pnl.gov. Coordinates plans for the implementation of DFS in a WAN environment and for the migration of existing ESnet AFS to DFS. The group may choose to implement a DFS pilot project.

Application Working Group (AWG)--Chair: Dick Kouzes, rt_kouzes@gate.pnl.gov. Develops strategies, tools, and pilot projects for collaboration in areas such as: (1) National Information Infrastructure (NII) focused projects; (2) information services, including data storage and retrieval, project documentation, and multimedia lab notebook and calendar; (3) distributed collaboration tools, including multimedia communications and software development; (4) collaboration on social organization issues, including effective standard operating procedures.

Architecture Task Force (ATF)--Chair: Arthurine Breckenridge, arbreck@sandia.gov. ATF is recommending a high-level architecture for a distributed collaboration environment that will eventually provide production-level support of research efforts in DOE. The architecture is being developed to complement, and possibly to help define, the DOE NII activities. It will also address the non-technical (social, political, and budgetary) issues to facilitate the establishment of such an environment.

Distributed Computing Environment Working Group (DCEWG)--Chair: Barry Howard, howard@nersc.gov. Examines and identifies the appropriate elements of a distributed computing environment, including such components as OSF/DCE, the Common Desktop Environment (CDE), the Common Object Request Broker Architecture (CORBA), and Load Sharing. Responsible for recommending strategies and pilots for implementing these components.

Distributed System Management Working Group (DSMWG)--Chair: John Volmer, volmer@anl.gov. Develops strategies, tools, and pilot projects for effectively providing systems management to distributed heterogeneous systems. Will also interact with the DCEWG for the effective systems management of DCEWG layer tools.

Group Communications Working Group (GCWG)--Chair: Allen Sturtevant, aps@es.net. Develops interoperable communications methods and strategies for documents and other forms of group communications exchange, including FTP, Gopher, and WWW servers, MIME e-mail extensions, graphics formats, and database exchanges. The GCWG's objective is to support the ability to have seamless communications between individuals and groups across heterogeneous platforms and information environments.