From: Richard Gerber (ragerber_at_lbl_dot_gov)
Date: 12/21/2007
Dear Bassi users,

Holiday Greetings! We hope Bassi continues to be a productive machine for you. The system had been in uninterrupted service for more than three months before last week's brief reboot.

We recycled the machine on Dec. 13 to clear "stuck" memory segments on many of the nodes. Some memory on the affected nodes had been unavailable to your applications. While most codes were probably unaffected, those with extreme memory bandwidth requirements, or those trying to use all available memory, may have had problems. All nodes are back to normal and we are working with IBM to keep this from happening again.

This issue was identified by NERSC's performance monitoring efforts, the results of which are available online at

https://www.nersc.gov/nusers/systems/bassi/monitor.php

Examine the "memrate" results to see how an extremely memory-intensive routine was affected by the recent problems.

We have been saving memory "snapshots" of each node every 15 minutes, and this data is available for each job from the NERSC completed jobs page at

https://www.nersc.gov/nusers/status/jobs/index.php

If your job ran for 30 minutes or more, you can click on the Job ID (Bassi jobs only) to see how your job used memory on each node (not each MPI task) as a function of time. If you ran using the NERSC IPM performance utility, the memory snapshots will appear on the same page as your IPM results. We hope these web pages automatically provide you with valuable information about your run with little to no effort on your part. If you are unfamiliar with IPM, please see

http://www.nersc.gov/nusers/resources/software/tools/ipm.php

Finally, if you are moving from Seaborg to Bassi, please review the Quick Start Guide for Seaborg Users at

http://www.nersc.gov/nusers/systems/bassi/quick.php

Regards,
Richard Gerber

--
Richard Gerber, Ph.D.            ragerber_at_lbl_dot_gov
NERSC                            phone: 510-486-6820
Lawrence Berkeley National Lab   fax: 510-486-4316
Berkeley, CA 94720
This archive was generated by hypermail 2.1.6 : 12/21/2007 PST