NERSC represents the primary supercomputing resource available to the Energy Research scientists, providing them with an unparalleled environment of high-end computing equipment. One of the primary virtues of NERSC is that it provides a unified integrated computing environment that can accommodate the spectrum of supercomputing needs of the ER community, from the Grand-Challenge-scale single jobs to the tremendous volume of medium-to-large size jobs. We feel strongly that this dual role is essential for NERSC, and thus we urge in the strongest terms that the unified integrated computing environment of NERSC be maintained.
NERSC's vision of the Unified Production Environment is an appropriate model. NERSC should be there to provide those things (both capacity and capability) that we do not have available to us locally and which we require in order to do our jobs as ER researchers. In addition to hardware, this also means software, technical leadership, and consulting. The users urge Washington to support this model. We think this is so important that we urge you to consider changing your present policies, charters, etc., if that is what is necessary for you to support it.
The users' overall concern is for NERSC to provide them with the best computing infrastructure for their scientific needs. This requires that the NERSC budget be maintained or, failing that, protected to the highest possible degree. It also requires that NERSC be given the freedom to spend its funding wisely in support of the users' spectrum of computing needs. Our detailed concerns can be summarized in the following three points.
First, and in the area that we perceive to be most neglected, additional cost-effective resources should be made available so that use of the C90, and eventually of the MPP, can be dedicated to the tasks most appropriate for these machines. The workstation resources available at the users' home sites are insufficient to meet this need.
The computer industry in recent years has turned away from the high-end, maximum-performance supercomputer. This change stems in part from the decline in support for this arena by Defense Programs, from which all of ER benefited and whose loss it now feels, and in part from a host of industry and economic changes. The biggest advances in recent years have been in cost-effective machines below the highest end, with increasingly powerful performance. NERSC's plan to add newly available SMPs as part of its Unified Production Environment seems to us a very viable solution. Replacing the expensive-to-maintain Cray-2s with SMPs should be part of this plan. We urge Washington management to push this plan forward with all possible speed as the best possible way to ease access to supercomputing at NERSC.
Second, it is essential that NERSC remain at the forefront of high-end, new-capability supercomputing. We believe that this should be achieved by keeping the acquisition of a general-purpose Massively Parallel Processor (MPP) at as high a priority as possible. The NERSC MPP team has done a stellar job in structuring their procurement to get the latest technology that will do the job. We urge that the MPP acquisition by NERSC be kept on track. If the MPP procurement fails, some of us are concerned that this could lead to the disappearance of the NERSC facility altogether.
Third, it is essential that the C90 supercomputing resource be maintained. Pushing the C90 into more new-capability modes such as the Special Parallel Processing (SPP) program is to be encouraged, but this plan requires that the broader base of users at the same time be accommodated by other platforms at NERSC.
The balance between Special Parallel Processing (SPP) and normal vector use of the C90 became a big issue this past year. On the one hand, SPP has provided some truly new-capability scientific computing for a small but significant subset of our users. The use of the C90 in SPP mode is competitive with present-day MPP machines. On the other hand, the SPP program forces other users off for significant fractions of the total time available. The success of last year's SPP was such that there were active plans by our Washington managers to ask for a significant expansion this year; these plans had to be largely shelved or reduced in scope when NERSC's analysis reported too large a negative effect on the rest of our users. So we are faced with some truly new-capability solutions on the existing supercomputer platform, but nowhere to move the rest of the traditional vector supercomputer codes for the fraction of time used by the new capability. SPP use represents a sudden new jump in demand. A clearer case for the need for additional supercomputing resources is hard to imagine.
Given that SPP has become a successful mode of operation, it is our view that the SPP allocation process should be absorbed into the normal procedures. We urge that this change be expedited.