
Move to CRT

Beginning in Fall 2015, NERSC is moving from the Oakland Scientific Facility (OSF) in downtown Oakland to a brand-new building, the Computational Research and Theory (CRT) facility, located on the main Lawrence Berkeley National Laboratory campus.

Impact on NERSC users

NERSC is making every effort to minimize the move's impact on its users. We do not expect to shut down the center completely at any point during the move, and NERSC will always have at least one computational system available. However, fewer computing resources were available while Edison was being moved. There will also be times when the I/O bandwidth to various global file systems is reduced for extended periods.

The move is being performed in phases spread over several months. During the move there will be extended periods when some resources are at OSF and others are at CRT. Communication between the two sites is provided by a primary 400 Gb/s network connection and a backup 100 Gb/s connection. Thus, if you are running on a compute platform located at OSF but performing I/O on a file system located at CRT, I/O bandwidth may be reduced.
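To give a rough sense of what the inter-site link rates mean for data movement, the sketch below estimates a best-case transfer time over the 400 Gb/s and 100 Gb/s connections. The 1 TB data size, the `efficiency` parameter, and the function name are illustrative assumptions, not NERSC figures. Real throughput also depends on file system load and on how many users share the link, so treat the results as lower bounds.

```python
def transfer_time_seconds(size_bytes: float, link_gbps: float,
                          efficiency: float = 1.0) -> float:
    """Best-case time to move size_bytes over a link rated at link_gbps.

    link_gbps is in gigabits per second (10**9 bits/s). efficiency is a
    hypothetical fraction of the link actually usable by one transfer.
    """
    if link_gbps <= 0 or not (0 < efficiency <= 1):
        raise ValueError("link_gbps must be > 0 and efficiency in (0, 1]")
    bits = size_bytes * 8
    return bits / (link_gbps * 1e9 * efficiency)


TB = 1e12  # decimal terabyte, in bytes

# Assumed example: 1 TB of data at the full rated speed of each link.
primary_s = transfer_time_seconds(1 * TB, 400)  # primary 400 Gb/s link
backup_s = transfer_time_seconds(1 * TB, 100)   # backup 100 Gb/s link

print(f"primary link: {primary_s:.0f} s, backup link: {backup_s:.0f} s")
```

At full rated speed the backup link takes four times as long as the primary for the same data, which is why traffic falling back to it would be noticeable for I/O-heavy jobs.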

Summary

Compute Systems

Edison began its move to CRT on Nov 30, 2015 and was brought back online on January 4, 2016. Carver and Hopper were retired at OSF on September 30 and December 15, 2015, respectively. 

There will be no outage for Genepool and PDSF.

Storage Systems

Global scratch was retired on Oct 14, 2015. All other global file systems (homes, common, project, projectb, dna, seqfs) are being moved to CRT in phases starting in early November. Communication between the two sites is provided by a primary 400 Gb/s network connection and a backup 100 Gb/s connection.

Each global file system is being migrated over the high-speed network to CRT. The migration process is an I/O-intensive activity and will consume a sizable fraction of the available file system bandwidth. During the migration, you may notice a reduction in file system performance. After the migration completes, some compute systems at OSF will have decreased file system bandwidth. We do not expect any outage of the global file systems during this process.

Both HPSS systems (user archive and system backup) will remain at OSF until the other moves are complete, and will be accessed over the 400 Gb/s network link.

Key Dates

Event                                         Date                        Status
Carver retires                                September 30, 2015 (noon)   Retired
Jesup testbed retires                         September 30, 2015 (noon)   Retired
Global Scratch retires                        October 14, 2015 (noon)     Retired
Cori Phase 1 becomes available to all users   November 11, 2015           All users enabled
Edison offline for move to new facility       November 30, 2015           Returned to service on January 4, 2016
Hopper retires                                December 15, 2015 (noon)    Retired

 

Detailed Schedule

September 2015

Cori Phase 1 and associated file systems were installed at CRT.

Carver was retired on Sept 30, 2015 at noon. On the same day (and same time), the Global Scratch file system became read-only and the Jesup testbed system was retired.

October 2015

Global Scratch was retired on Oct 14, 2015 at noon.

Cori Phase 1 became available to early users.

November 2015

Cori Phase 1 became available to all users on Nov 11, 2015.

The Global Projecta file system was migrated to CRT (Nov 13-16). During the migration, available bandwidth was reduced. After the migration, I/O performance from systems at OSF accessing /global/projecta at CRT may be reduced. There was no outage of /global/projecta during this process.

The Global Project file system was migrated to CRT (Nov 30 to Dec 6). During the migration, available bandwidth was reduced. After the migration, I/O performance from systems at OSF accessing /global/project at CRT may be reduced. There was no outage of /global/project during this process.

Edison was powered off at 7:00 am PST on Nov 30, 2015. The Edison scratch file systems were reformatted, and ALL files on the /scratch1, /scratch2, and /scratch3 file systems were deleted.

Edison queues were turned off and all running jobs were killed at 00:01 PST (midnight) on November 30, 2015. All queued jobs were deleted. Edison login nodes were available until 7am PST on Nov 30 for users to retrieve files. 

December 2015

Edison began the move to CRT. The relocation was expected to result in up to 6 weeks of downtime.

Hopper was retired on December 15 at noon.

Global homes were migrated to CRT from December 15 to 21.

January 2016

Edison returned to service in its new home at CRT on January 4, with SLURM as the batch scheduler.
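As a quick check of the Edison outage against the "up to 6 weeks" estimate, the short sketch below computes the elapsed time between the power-off date (November 30, 2015) and the return to service (January 4, 2016), both taken from this schedule. The function name is illustrative only.

```python
from datetime import date


def outage_days(offline: date, online: date) -> int:
    """Number of calendar days between going offline and coming back online."""
    if online < offline:
        raise ValueError("online date must not precede offline date")
    return (online - offline).days


# Dates from the schedule above.
edison_offline = date(2015, 11, 30)
edison_online = date(2016, 1, 4)

days = outage_days(edison_offline, edison_online)
print(f"Edison outage: {days} days ({days / 7:.0f} weeks)")
```

The outage works out to 35 days, or 5 weeks, which is inside the 6-week window that was planned for the relocation.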

/global/projectb and /global/dna/ (JGI file systems) were migrated to CRT between January 11 and 16. /global/seqfs (JGI file system) was migrated to CRT from January 27 to February 3. During the migration, available bandwidth was reduced. After the migration completed, I/O performance from systems at OSF accessing these file systems at CRT was reduced. There was no outage of these file systems during this process.

Replacement Genepool interactive nodes at CRT were provided to users on January 26.

February 2016

A cluster providing resources to PDSF, JGI, the Materials Project, and Babbage was moved to CRT and reconfigured between February 8 and February 29. New compute components for PDSF and JGI were already in place at CRT, so this was not an outage for those systems. However, the total amount of available compute resources was reduced during the move.

The 68 PDSF compute nodes that were offline during the move were brought back into production on February 29.

Genepool's Phoebe test system was retired on February 25th.

The Materials Project had an outage of 3 weeks during the move, and all nodes were back online on February 29. Babbage had an outage of 2 months and was back online on April 7.

March 2016

Genepool's legacy compute nodes were retired on March 8th. The final 3 remaining legacy Genepool interactive nodes were retired on March 11.

JGI Oracle Database servers and storage were relocated from OSF to CRT on March 17. The systems affected are: gpodb07.nersc.gov, gpodb08.nersc.gov, gpodb11.nersc.gov, gpodb13.nersc.gov.

Science Gateway nodes were moved from OSF to CRT on March 24.

The migration of JGI web services and databases from OSF to CRT was completed on March 31.

Summer 2016

Cori Phase 2 will be delivered and installed at CRT.

Fall 2016

Cori Phase 2 will be available to all users.