NERSC: Powering Scientific Discovery Since 1974

Outage Log

This table lists events reported, ongoing, and resolved on NERSC systems. It is a historical record and may not be updated while a system event is in progress.

| Event Date/Time | Up Date/Time | System | Comment |
|---|---|---|---|
| 01/17/17 12:06 PST | 01/17/17 12:40 PST | Global Homes | Unavailable. Logins are unavailable during this time. |
| 01/17/17 12:06 PST | 01/17/17 12:40 PST | Global Common | Unavailable. Engineers are working to resolve this issue. |
| 01/17/17 12:06 PST | 01/17/17 13:07 PST | Cori | Unavailable. System unavailable due to a filesystem issue. Engineers are working to resolve the problem ASAP. |
| 01/17/17 12:06 PST | 01/17/17 14:00 PST | Genepool | Unavailable. System unavailable due to a filesystem issue. Engineers are working to resolve the problem ASAP. |
| 01/17/17 12:06 PST | 01/17/17 13:10 PST | PDSF | Unavailable. System unavailable due to a filesystem issue. Engineers are working to resolve the problem ASAP. |
| 01/17/17 12:01 PST | 01/17/17 13:07 PST | NERSC Center | Unavailable. Center-wide outage due to a problem with the filesystems. Engineers are working to bring systems back up ASAP. |
| 01/17/17 10:00 PST | 01/17/17 10:45 PST | MongoDB Services | Scheduled maintenance. Services on mongodb01 and mongodb02 will be unavailable or at risk during the upgrade of MongoDB to version 3.2. Services on mongodb03 and mongodb04 are available. |
| 01/17/17 8:00 PST | 01/17/17 20:00 PST | Edison | Scheduled maintenance. Edison is currently unavailable while engineers perform scheduled maintenance. |
| 01/17/17 8:00 PST | 01/17/17 8:42 PST | NIM | Scheduled maintenance. The NIM interface will be inaccessible during this time. |
| 01/14/17 17:48 PST | - | Cori | KNL mode changes are temporarily disabled. |
| 01/14/17 13:40 PST | 01/14/17 22:30 PST | Cori | Unavailable. Cori is currently unavailable. Engineers are actively working to resolve this issue as soon as possible. |
| 01/13/17 16:45 PST | 01/13/17 17:15 PST | ProjectB | /projectb unavailable. /projectb is experiencing some intermittent problems. Engineers are currently investigating the issue. |
| 01/13/17 11:00 PST | 01/13/17 11:41 PST | Edison | System in degraded mode. /scratch3 filesystem is having intermittent issues. Engineers are actively working on the problem. |
| 01/12/17 8:00 PST | 01/12/17 18:14 PST | Edison | Scheduled maintenance. /scratch1 will be upgraded and not accessible during this time. |
| 01/11/17 15:38 PST | 01/11/17 16:57 PST | Cori | System in degraded mode. No KNL jobs can be submitted or started. Existing running jobs OK. Engineers investigating. Haswell job submission, start, and running are working properly at this time. |
| 01/11/17 9:03 PST | 01/11/17 15:15 PST | NX Services | Scheduled maintenance. NX unavailable for maintenance and upgrades. All connected sessions will be terminated. |
| 01/10/17 14:45 PST | 01/10/17 14:48 PST | Cori | System in degraded mode. Slurm partitions are currently down. Engineers are currently working on the system. |
| 01/10/17 13:01 PST | 01/10/17 13:23 PST | Edison | Unavailable. Intermittent Slurm unavailability owing to Allocation Year turnover activities. |
| 01/10/17 12:00 PST | 01/10/17 13:30 PST | Edison | System in degraded mode. Intermittent Slurm unavailability owing to Allocation Year turnover activities. |
| 01/10/17 12:00 PST | 01/10/17 13:00 PST | Cori | System in degraded mode. Intermittent Slurm unavailability owing to Allocation Year turnover activities. |
| 01/10/17 8:30 PST | 01/10/17 8:55 PST | Cori | Unavailable. The batch scheduler was down for a brief moment. The system is up and running now. |
| 01/10/17 7:30 PST | 01/10/17 9:00 PST | Cori | System in degraded mode. Intermittent Slurm unavailability owing to Allocation Year turnover activities. |
| 01/09/17 23:30 PST | 01/10/17 3:00 PST | NIM | Scheduled maintenance. NIM will be unavailable at 11:30pm 01/09/2017 until about 1:00pm 01/10/2017 for the Allocation Year rollover. |
| 01/09/17 8:35 PST | 01/09/17 16:50 PST | Cori | Unavailable. System is currently experiencing issues. Engineers are aware and actively working on the problem. |
| 01/04/17 13:55 PST | 01/04/17 14:23 PST | HPSS User | Unavailable. |
| 01/04/17 9:00 PST | 01/04/17 10:02 PST | HPSS User | Unavailable. |
| 01/04/17 8:57 PST | 01/04/17 14:30 PST | HPSS Backup | Scheduled maintenance. Scheduled maintenance for backup. |
| 12/28/16 9:00 PST | 12/28/16 13:20 PST | HPSS User | Scheduled maintenance. |
| 12/24/16 6:21 PST | 12/24/16 10:12 PST | NERSC Website | Unavailable. Engineers are working to restore service. |
| 12/20/16 8:00 PST | 12/20/16 18:36 PST | Cori | Scheduled maintenance. The Cori system including the login, storage and compute nodes will not be available during the maintenance. |
| 12/20/16 8:00 PST | 12/20/16 18:36 PST | Edison | Scheduled maintenance. Cori scratch file system will not be available on the login or compute nodes when Cori is under maintenance. Jobs that require cscratch1 license will not run during this time. |
| 12/17/16 23:11 PST | 12/18/16 8:35 PST | Edison | Unavailable. Edison compute nodes are currently unavailable. Cray engineers are working on the issue. |
| 12/17/16 9:11 PST | 12/17/16 9:12 PST | Cori | System in degraded mode. Datawarp is available but in a degraded state. Engineers are working to resolve this issue. |
| 12/17/16 2:06 PST | 12/17/16 9:11 PST | Cori | System in degraded mode. Datawarp is currently unavailable. Engineers are actively working to resolve this issue. |
| 12/16/16 9:15 PST | 12/16/16 9:45 PST | Edison | Logins unavailable, batch jobs running. New logins to Edison are currently hanging due to filesystem issues. Jobs using /scratch2 may also be impacted. Engineers are actively working to resolve this issue. |
| 12/15/16 7:21 PST | 12/16/16 4:17 PST | Cori | System in degraded mode. Engineers are working on repairs. |
| 12/14/16 22:54 PST | 12/15/16 0:50 PST | Cori | Unavailable. Batch services down. Currently running jobs will continue. No new jobs can be submitted, no jobs will start. Engineers investigating. |
| 12/14/16 9:50 PST | 12/14/16 12:55 PST | HPSS User | Scheduled maintenance. |
| 12/13/16 16:10 PST | 12/14/16 1:28 PST | Cori | System in degraded mode. Cori is in a degraded state. Engineers are investigating the issue. |
| 12/13/16 16:00 PST | 12/14/16 16:50 PST | Cori | System in degraded mode. Datawarp is currently unavailable. Engineers are actively working to resolve this issue. |
| 12/13/16 13:12 PST | 12/13/16 13:53 PST | Cori | System in degraded mode. No new jobs will start. |
| 12/09/16 16:49 PST | 12/09/16 22:10 PST | Science Gateway Services | Unavailable. SGN01 unavailable - Portals on portals.nersc.gov are currently down. Investigation ongoing. This includes qcd.nersc.gov, sgn01, spot.nersc.gov, unwise.nersc.gov, cxidb.org. |
| 12/09/16 9:40 PST | 12/09/16 10:32 PST | Cori | Unavailable. Engineers are actively investigating filesystem problems which are causing logins and various operations to hang. |
| 12/09/16 9:40 PST | 12/09/16 10:32 PST | Edison | System in degraded mode. The "cscratch1" filesystem is currently unavailable. |
| 12/08/16 7:40 PST | 12/08/16 15:00 PST | Cori | Unavailable. Cori is unavailable while engineers are working to resolve an issue. |
| 12/05/16 14:00 PST | 12/05/16 14:38 PST | NX Services | Scheduled maintenance. The NX server will be restarted to implement needed configuration changes. All connected sessions will be terminated. |
| 12/02/16 17:32 PST | 12/02/16 19:22 PST | Cori | System in degraded mode. High speed network services disrupted. No new jobs will start. |
| 11/30/16 19:26 PST | 11/30/16 21:18 PST | Cori | System in degraded mode. Cori is experiencing network failures. Engineers are investigating the issue. Job scheduling has been paused; however, new jobs can be submitted. |
| 11/28/16 9:00 PST | 11/28/16 16:54 PST | MongoDB Services | Scheduled maintenance. Services on mongodb0* will be unavailable or at risk during an upgrade of the MongoDB version. |
| 11/27/16 18:42 PST | 11/27/16 22:07 PST | Cori | System in degraded mode. /cscratch1 is currently unavailable. Engineers are investigating the issue. |
| 11/27/16 18:42 PST | 11/27/16 22:07 PST | Edison | System in degraded mode. /cscratch1 is currently unavailable. Engineers are investigating the issue. |
| 11/22/16 4:25 PST | 11/22/16 9:53 PST | HPSS Backup | System in degraded mode. |
| 11/21/16 15:50 PST | 11/21/16 17:28 PST | PDSF | System in degraded mode. PDSF jobs have been paused. Network issues causing /global filesystem to not be available. Engineers are currently investigating the issue. |
| 11/21/16 15:39 PST | 11/21/16 17:36 PST | Genepool | Unavailable. Genepool jobs have been paused. We are investigating. |
| 11/21/16 15:39 PST | 11/21/16 17:15 PST | Cori | System in degraded mode. Network issue causing /global filesystem issues. Slurm partitions are currently marked down. Engineers are investigating. |
| 11/21/16 15:39 PST | 11/21/16 17:25 PST | Edison | System in degraded mode. Network issues causing /global filesystem to be unavailable. Slurm partitions currently marked down. Engineers are currently working on the issue. |
| 11/21/16 15:30 PST | 11/21/16 16:40 PST | Network | Unavailable. Network issue causing /global filesystems to be unavailable on the systems. Engineers are aware and are working on the issue. |
| 11/21/16 15:30 PST | 11/21/16 17:27 PST | Science Gateways | System in degraded mode. Network issue causing /global filesystems to be unavailable on the systems. Engineers are aware and are working on the issue. |
| 11/21/16 10:32 PST | 11/21/16 14:05 PST | Edison | System in degraded mode. Several compute nodes are missing /global/cscratch1. Nodes will be drained and rebooted. Engineers are currently working on this issue. |
| 11/17/16 9:00 PST | 11/17/16 11:30 PST | scidb2.nersc.gov (Science Database) | Unavailable. Emergency maintenance. |
| 11/17/16 7:00 PST | 11/18/16 17:05 PST | Cori Scratch Filesystem | Scheduled maintenance. Jobs that require a CSCRATCH license will not run during this time due to filesystem maintenance. |
