NERSC: Powering Scientific Discovery Since 1974

Outage Log

This table lists events reported, ongoing, and resolved on NERSC systems. It is a historical record and may not be updated while a system event is in progress.

Event Date/Time | Up Date/Time | System | Comment
10/17/18 20:27 PDT | 10/18/18 8:24 PDT | NX Server | Unavailable. NX / NoMachine is unavailable.
10/17/18 10:00 PDT | 10/17/18 13:00 PDT | Science Databases | Scheduled maintenance. Services on nerscdb03 and nerscdb04 will be down briefly (5-15 minutes) within the maintenance window for system software updates.
10/17/18 10:00 PDT | 10/17/18 11:00 PDT | Spin | Scheduled maintenance. Science Gateways and services on Spin that use global file systems will be down while a storage subsystem is reconfigured.
10/17/18 10:00 PDT | 10/17/18 11:00 PDT | Science Gateways | Scheduled maintenance. Science Gateways and services on Spin that use global file systems will be down while a storage subsystem is reconfigured.
10/17/18 9:00 PDT | 10/17/18 10:45 PDT | JGI Web and Database Servers | Scheduled maintenance. Servers will be down briefly (5-15 minutes) within the maintenance window for system software updates and reboots.
10/17/18 7:00 PDT | 10/17/18 23:55 PDT | Cori | Scheduled maintenance. Cori monthly maintenance. The login nodes and Cori scratch filesystem (cscratch1) will remain available for part of the maintenance window, but no jobs will run. The batch system will be unavailable.
10/10/18 10:23 PDT | 10/11/18 10:41 PDT | HPSS Backup | Unavailable. System requires restart to clear mount queue.
10/10/18 2:00 PDT | 10/10/18 11:04 PDT | HPSS User | System in degraded mode. HPSS systems are degraded; users may experience delays in retrieving data.
10/09/18 9:30 PDT | 10/09/18 12:40 PDT | MyProxy | Scheduled maintenance. NERSC quarterly maintenance. Services will be down briefly (5-15 minutes) within the maintenance window for system software updates.
10/09/18 9:30 PDT | 10/09/18 9:35 PDT | NIM | Scheduled maintenance. NERSC quarterly maintenance. Services will be down briefly (5-15 minutes) within the maintenance window for system software updates.
10/09/18 9:00 PDT | 10/09/18 16:15 PDT | Spin | Scheduled maintenance. NERSC quarterly maintenance. Services may be briefly interrupted (1-2 minutes) within the maintenance window for container restarts.
10/09/18 8:00 PDT | 10/09/18 11:09 PDT | Data Transfer Nodes | Scheduled maintenance.
10/09/18 8:00 PDT | 10/09/18 12:00 PDT | Globus Endpoints | Scheduled maintenance. Globus data transfers and NERSC endpoint activation (DTN, HPSS, Edison, Cori, PDSF, DTN-JGI, and shared NERSC endpoints) will not work.
10/02/18 17:42 PDT | 10/03/18 13:10 PDT | HPSS User | Unavailable. System is currently unavailable. Engineers are investigating the issue, but no estimated uptime is available yet.
10/02/18 17:42 PDT | 10/03/18 13:09 PDT | HPSS Backup | Unavailable. System is currently unavailable. Engineers are investigating the issue, but no estimated uptime is available yet.
09/26/18 7:00 PDT | 09/27/18 2:29 PDT | Edison | Scheduled maintenance. Edison monthly maintenance.
09/25/18 9:00 PDT | 09/25/18 19:35 PDT | Science Databases | Scheduled maintenance. Postgres databases on nerscdb03 will be upgraded to v10 and will be unavailable during this period.
09/24/18 8:41 PDT | 09/24/18 9:39 PDT | Edison | System in degraded mode. Engineers are checking into Lustre scratch issues.
09/20/18 16:50 PDT | 10/17/18 7:00 PDT | Cori | System in degraded mode. Degraded state extended. DataWarp/BurstBuffer is currently unavailable. Bug identified, patch has been released, awaiting patch installation during planned Cori maintenance on 10/17.
09/19/18 9:00 PDT | 09/19/18 13:52 PDT | HPSS User | Scheduled maintenance.
09/19/18 7:00 PDT | 09/20/18 0:27 PDT | Cori | Scheduled maintenance.
09/18/18 15:36 PDT | 09/18/18 17:30 PDT | Cori | Dedicated runs. Large reservation of Cori Haswell nodes. All nodes for JGI and realtime queues, and limited nodes for debug queue, will remain available during this reservation. Interactive nodes will not be available. Note: this is a re-run of the dedicated runs which were interrupted this morning.
09/18/18 9:56 PDT | 09/18/18 15:36 PDT | Cori | Unavailable. Cori is currently unavailable; engineers are investigating the issue. The next update will be provided as more information becomes available.
09/18/18 9:00 PDT | 09/18/18 9:56 PDT | Cori | Dedicated runs. Large reservation of Cori Haswell nodes. All nodes for JGI and realtime queues, and limited nodes for debug queue, will remain available during this reservation. Interactive nodes will not be available. Note: this reserved time was interrupted by an unplanned outage, and has been replaced by a subsequent reservation starting at 15:36 PDT.
09/17/18 6:11 PDT | 09/17/18 8:15 PDT | Cori | System in degraded mode. Existing jobs continue to run; new jobs are unable to start. Engineers are working to correct a network issue.
09/01/18 8:28 PDT | 09/01/18 10:00 PDT | Cori | System in degraded mode. cscratch1 performance is degraded; engineers are investigating.
08/25/18 20:32 PDT | 08/26/18 3:04 PDT | Cori | System in degraded mode.
08/24/18 0:43 PDT | 08/24/18 4:28 PDT | Cori | System in degraded mode. The system is degraded; engineers are investigating. Logins are available; however, jobs cannot be submitted.
08/21/18 10:46 PDT | 08/21/18 16:08 PDT | PDSF | System in degraded mode.
08/17/18 8:00 PDT | 08/21/18 7:12 PDT | NERSC Center | Scheduled maintenance. The NERSC facility will be conducting power maintenance. All services will be unavailable for the duration of the maintenance window.
08/17/18 8:00 PDT | 08/21/18 7:12 PDT | Cori | Scheduled maintenance.
08/17/18 8:00 PDT | 08/17/18 19:26 PDT | Edison | Scheduled maintenance.
08/17/18 8:00 PDT | 08/21/18 16:49 PDT | Genepool | Scheduled maintenance.
08/17/18 8:00 PDT | 08/20/18 16:13 PDT | PDSF | Scheduled maintenance.
08/17/18 8:00 PDT | 08/20/18 11:25 PDT | DNA | Scheduled maintenance.
08/17/18 8:00 PDT | 08/20/18 11:25 PDT | Global Common | Scheduled maintenance.
08/17/18 8:00 PDT | 08/20/18 11:25 PDT | Global Homes | Scheduled maintenance.
08/17/18 8:00 PDT | 08/20/18 11:25 PDT | Project | Scheduled maintenance.
08/17/18 8:00 PDT | 08/20/18 11:25 PDT | ProjectA | Scheduled maintenance.
08/17/18 8:00 PDT | 08/20/18 11:25 PDT | ProjectB | Scheduled maintenance.
08/17/18 8:00 PDT | 08/19/18 21:40 PDT | HPSS Backup | Scheduled maintenance.
08/17/18 8:00 PDT | 08/19/18 21:40 PDT | HPSS User | Scheduled maintenance.
08/17/18 8:00 PDT | 07/20/18 10:30 PDT | MongoDB | Scheduled maintenance. MongoDB services will be upgraded from version 3.2 to version 3.4.
08/17/18 8:00 PDT | 08/20/18 15:51 PDT | Data Transfer Nodes | Scheduled maintenance.
08/17/18 8:00 PDT | 08/20/18 11:11 PDT | mongodb | Scheduled maintenance.
08/10/18 14:50 PDT | 08/10/18 17:00 PDT | Genepool | System in degraded mode. Genepool Database host gpdb23 was degraded between 2:50 PM and 5:00 PM.
08/03/18 16:42 PDT | 08/03/18 18:55 PDT | Genepool | System in degraded mode. Queues are stopped to determine stability.
08/03/18 14:20 PDT | 08/03/18 16:36 PDT | Cori | System in degraded mode. Cori is currently degraded. Existing jobs are running; new jobs are paused.
08/01/18 9:00 PDT | 08/01/18 10:41 PDT | HPSS Backup | Scheduled maintenance. The maintenance has been completed.
07/25/18 7:00 PDT | 07/25/18 20:58 PDT | Edison | Scheduled maintenance. Maintenance will be extended for additional testing.
07/20/18 18:00 PDT | 07/21/18 1:50 PDT | Cori | System in degraded mode. Due to a significant number of downed nodes, Cori has been degraded.
07/18/18 9:00 PDT | 07/18/18 10:23 PDT | HPSS Backup | Scheduled maintenance.
07/18/18 9:00 PDT | 07/18/18 12:00 PDT | HPSS User | Scheduled maintenance.
07/11/18 23:45 PDT | 07/13/18 16:20 PDT | Cori | System in degraded mode. The system is currently degraded while Cray engineers complete hardware work.
07/11/18 7:00 PDT | 07/11/18 23:45 PDT | Cori | Scheduled maintenance. The maintenance has been further extended due to multiple ongoing blower issues.
07/11/18 7:00 PDT | 07/11/18 19:00 PDT | Edison | System in degraded mode. cscratch1 will not be available on Edison due to Cori scheduled maintenance.
07/11/18 7:00 PDT | 07/11/18 19:00 PDT | Data Transfer Nodes | System in degraded mode. cscratch1 will not be available on DTN systems due to Cori scheduled maintenance.
07/08/18 17:20 PDT | 07/09/18 5:15 PDT | Science Database Services | System in degraded mode. Service was slow or unresponsive for several periods ranging from 10 minutes to 2 hours during this time window due to a failing backup process.
07/07/18 11:30 PDT | 07/07/18 17:40 PDT | Cori | System in degraded mode. Scratch filesystem issues have been repaired. Engineers have returned full system functionality to Cori as of 17:40 PDT.
07/07/18 11:30 PDT | 07/07/18 17:40 PDT | Edison | System in degraded mode. Scratch filesystem issues have been repaired. Engineers have returned full system functionality to Edison as of 17:40 PDT.
