NERSC: Powering Scientific Discovery Since 1974

NERSC Achieves Breakthrough 93% Utilization on Cray T3E

April 19, 1999

The NERSC Computational Systems Group last week completed the final acceptance tests for the Cray T3E, concluding an almost two-year effort to meet all conditions of the original purchase agreement. Key to passing the tests was the successful implementation of SGI's "psched" scheduling daemon. With all the features of psched running, and with NQS and "prime job" control scripts written by Computational Systems Group staff, the T3E has posted utilization figures of more than 93 percent, a level usually associated with vector machines, said group lead Jim Craw. The system has also exceeded 90 percent availability overall for the past two weeks.

"This is an unprecedented level of utilization of a massively parallel machine in a general purpose computing environment," Jim said. "These impressive results are due, in part, to a lot of hard work by Mike Welcome, Brent Draney and Tina Butler of the Systems Group and by Steve Luzmoor and Bryan Hardy of SGI."

Efficiently scheduling a large MPP system is difficult. On the T3E, parallel applications must run on logically consecutive processors. In the past, only one application could run on a given range of processing elements, to ensure synchronous scheduling. As applications entered and exited the system, the range of available processing elements would fragment into many small groups of unused processors. After a while, only small jobs could enter the system.
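The fragmentation problem described above can be made concrete with a small sketch. This is purely illustrative Python (not NERSC or SGI code, and the 16-PE machine and job names are invented): jobs must occupy consecutive processing elements, so when non-adjacent jobs exit, the free PEs no longer form a single hole large enough for a big job.

```python
# Illustrative sketch (not NERSC code): why contiguous PE allocation fragments.
# Model a row of 16 PEs; jobs must occupy logically consecutive PEs.

def largest_free_run(pes):
    """Length of the longest run of free (None) PEs."""
    best = cur = 0
    for owner in pes:
        cur = cur + 1 if owner is None else 0
        best = max(best, cur)
    return best

pes = [None] * 16

def launch(job, size):
    """Place a job on the first contiguous free range of `size` PEs, or fail."""
    run_start = None
    for i in range(len(pes)):
        if pes[i] is None:
            if run_start is None:
                run_start = i
            if i - run_start + 1 == size:
                for j in range(run_start, run_start + size):
                    pes[j] = job
                return True
        else:
            run_start = None
    return False

def exit_job(job):
    """Free every PE held by a job."""
    for i, owner in enumerate(pes):
        if owner == job:
            pes[i] = None

# Fill the machine with four 4-PE jobs, then let two non-adjacent ones exit.
for job in "ABCD":
    launch(job, 4)
exit_job("A")
exit_job("C")

# 8 PEs are free in total, but they sit in two 4-PE holes,
# so an 8-PE job cannot be placed:
print(sum(p is None for p in pes))   # total free PEs: 8
print(largest_free_run(pes))         # largest contiguous hole: 4
print(launch("E", 8))                # False
```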

The psched load-balancing feature automatically migrates running applications to collapse small holes, creating large regions of available processors for larger jobs. The psched gang scheduler allows more than one application to run on a range of processors and schedules them synchronously: one application runs for a while with complete control of the processors while the other is suspended. When a "time slice" is up, the applications switch roles and the suspended application gets to run. Another new feature of psched is the ability to designate a job with "prime" status. A prime job will preempt any other (non-prime) application and is given preferred launching status to get it running as soon as possible.
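The two psched mechanisms above can be sketched in a few lines. This is a hedged illustration of the ideas only, not SGI's implementation: `compact` shows migration collapsing holes into one large free region, and `gang_schedule` shows two applications alternating ownership of a shared PE range each time slice (the job names are invented).

```python
# Sketch of two psched ideas (illustration only, not SGI's implementation).

def compact(pes):
    """Migrate running jobs toward PE 0, preserving order,
    so the free PEs coalesce into one large contiguous hole."""
    jobs = [owner for owner in pes if owner is not None]
    return jobs + [None] * pes.count(None)

def gang_schedule(apps, slices):
    """Alternate which application runs on the shared PE range
    each time slice; the others are suspended meanwhile."""
    return [apps[t % len(apps)] for t in range(slices)]

# Two 2-PE jobs separated by 2-PE holes become one 4-PE hole:
print(compact(["B", "B", None, None, "D", "D", None, None]))
# ['B', 'B', 'D', 'D', None, None, None, None]

# Two applications share one range, swapping every slice:
print(gang_schedule(["ocean", "qcd"], 4))
# ['ocean', 'qcd', 'ocean', 'qcd']
```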

Another recent change, requested by NERSC, is the ability to limit the amount of interactive work on the system at any given time. In the past, this was achieved by dedicating one set of processing elements to batch-only work and a second region to interactive work. If there was no interactive work on the system at the time, those processors would idle. Now that all processors are available for both batch and interactive work, the staff schedules the entire system with batch work and allows interactive users to run with prime status. Interactive work is limited to 132 processors at a time.

Finally, Mike Declerck and Mike Welcome of NERSC and Steve Luzmoor of SGI have developed a sophisticated set of Perl scripts to manage the batch system and control prime jobs. The scripts control queue and job activation so that Grand Challenge work runs in the evening, large jobs at night, and smaller jobs during the day. The scripts use the checkpoint/restart feature to halt one job mix and start the next.
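The time-of-day policy those scripts implement can be sketched as follows. The NERSC scripts themselves were written in Perl; this Python fragment only illustrates the policy the article describes, and the exact hour boundaries and class names are assumptions, not details from the article.

```python
# Illustrative sketch of the time-of-day queue policy (not the NERSC Perl
# scripts). The hour boundaries and class names below are assumptions.

def active_job_class(hour):
    """Pick which job mix the queues favor for a given hour of day (0-23)."""
    if 18 <= hour < 22:          # evening: Grand Challenge work
        return "grand-challenge"
    if hour >= 22 or hour < 6:   # night: large jobs
        return "large"
    return "small"               # day: smaller jobs

print(active_job_class(19))  # 'grand-challenge'
print(active_job_class(2))   # 'large'
print(active_job_class(10))  # 'small'
```

In the real system, switching between mixes is not just a queue toggle: as the article notes, the scripts checkpoint the running job mix and restart the next one.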

Mike Welcome notes that the overall project was the result of an aggressive testing and problem reporting effort carried out with the help of on-site SGI/Cray analysts. In addition, forty UNICOS/MK kernels have been booted on the system as we continue to test software and bring new services into production. The resulting success, Mike adds, will benefit both NERSC and other centers with T3Es. "We're running a large MPP system with a dynamic and diverse job mix, both batch and interactive, at greater than 90 percent utilization rate," Mike says. "NERSC and SGI have a lot to be proud of."


About NERSC and Berkeley Lab
The National Energy Research Scientific Computing Center (NERSC) is a U.S. Department of Energy Office of Science User Facility that serves as the primary high-performance computing center for scientific research sponsored by the Office of Science. Located at Lawrence Berkeley National Laboratory, the NERSC Center serves more than 7,000 scientists at national laboratories and universities researching a wide range of problems in combustion, climate modeling, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a DOE national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. Department of Energy.