ERSUG Meeting Summary Notes, November 15, 1999
Here are some highlights from the discussions (excluding the items contributed by ERSUG Chair Bas Braams, which appear below):
During the state of NERSC presentation by Jim Craw, a primary topic of discussion was the processing capability of the PVP cluster. Since the batch system processors were upgraded to SV1s, some concern has been expressed about the relatively poorer performance of the J90SE processors on the interactive system, Killeen. Naturally everybody would prefer to have the processors on Killeen upgraded to SV1s as well. This would make the cluster more uniform (upcoming compiler releases are expected to diverge, with more optimization in place for the SV1s) and would improve performance (especially important to 2-3 groups). Unfortunately, the NERSC budget didn't allow upgrading the CPUs on Killeen at the time the batch processors were upgraded. Various options for moving processors around were discussed, but the complexity of running a system with a mix of CPU types made that unwise. It was also felt best to put the enhanced processing capability where the system load is heaviest, namely on the batch systems. In the end (particularly with no one to carry the banner for the groups who most want improvements in the interactive system), the consensus of the ERSUG attendees was that the J90 cluster should stay as it is now, with the SV1 CPUs on the batch systems (which are typically fully loaded) and the older J90SE processors on the interactive system, Killeen. Naturally this decision can be reopened as we gain experience with the new compiler upgrades optimized for the SV1s, or if money becomes available for an upgrade.
Keith Fitzgerald discussed the status of the NERSC file storage system, focusing on HPSS. One highlight was the discussion of the "PROBE" collaboration with ORNL for testing aspects of HPSS. For more information, see the "PROBE" website.
Bill Kramer discussed the status of NERSC-3. The agreement between NERSC and IBM emphasizes the production nature of the NERSC-3 system. Because of this, many of the tests that must be passed during the acceptance period and contract period of the NERSC-3 deployment include fairly full production code runs as well as system restarts. At the time of this meeting (November 15, 1999), the NERSC-3 acceptance was suspended while a problem with significant variation in code runtimes was being resolved.
NERSC should have a new machine room in operation during the summer of next year (2000). This new building is in Oakland. In current plans it will house phase 2 of the NERSC-3 procurement as well as some of the NERSC clusters.
The first NERSC Users Group (NUG) meeting is expected to be held at ORNL. A meeting time has not yet been set.
Here are a few items that Bas Braams, ERSUG Chair, summarized after this ERSUG meeting:
NERSC is considering creating a production Linux cluster for its general users to take some of the load from the MPP and/or the PVP platforms. An email working group of NERSC staff and users was formed to discuss options for such a cluster. The NERSC principals are Craig Tull and Bill Kramer, and the users who volunteered at the meeting are David Dean and Doug Olson. After the meeting I spoke with Charles Karney, who is responsible for a small Linux cluster at PPPL, and he is also available for this discussion group.
We aim to produce the next installment of the Greenbook (user requirements for hardware and services at NERSC) about a year to 15 months from now. The volunteers who will start the process sometime this spring are David Dean, Brian Hingerty, Doug Rotman, Robert Ryne and Bas Braams. We'll want to pull the descriptive material (current research that uses NERSC) together over the summer, and to have a tighter schedule than last time for the discussions about the user requirements and recommendations.
The draft charter for the new NERSC user group and its executive (NUG and NUGEX) was adopted with a modification concerning the transition process. A nominating committee was formed consisting of David Dean, Brian Hingerty, Mike Minkoff, Bas Braams and Theresa Windus (the first four had already been doing groundwork before the meeting). This group, working with Horst Simon, will make precise recommendations to the transition NUGEX (the old EXERSUG) about election modalities, the slate of candidates, and about who will stay on from EXERSUG to NUGEX. (The draft charter had it that these decisions would all be made at the past ERSUG meeting.) We anticipate getting the elections underway in December.
Sandy Merola provided the URL for the DOE directives home page: http://www.directives.doe.gov.