Seaborg Status

From: Jim Craw (craw_at_nersc.gov)
Date: 11/02/2001


Hello SP users:

I wish to apologize to our Seaborg users for the recent instability problems, 
especially over the last few weeks.  As of last Monday, NERSC elevated the 
GPFS problems to a "CRIT SIT" level (IBM talk for a customer site going 
critical).  As of Wednesday, we have installed fixes for several known 
problems (e.g., random nodes crashing due to GPFS errors, GPFS terminating on 
interactive nodes, GPFS limits, etc.).

The only remaining related problem involves executables that are left in an 
unusable state when GPFS restarts after being "terminated".  We have not 
experienced this problem since the fixes were put on the system on Wednesday, 
and IBM has now identified the problem and is generating and testing a fix.  
We hope to receive the fix early next week and test it on our development 
system before planning to install it on Seaborg.

In the meantime, the system has definitely stabilized, in both hardware and 
software.  The load has picked up, but turnaround still looks good.  Please 
go ahead and submit more jobs if you would like to and are able.

Sorry for any inconvenience caused.  Regards,

Jim Craw
Computational Systems Group Lead
