
[Users] REMINDER: Important Changes at NERSC in New Allocation Year

Author: Rebecca Hartman-Baker <rjhartmanbaker_at_lbl.gov>
Date: 2020-01-09 09:01:31

Dear NERSC Users,

I want to make you aware of a number of changes coming to NERSC in the new allocation year, which begins this Tuesday, January 14. For a comprehensive list of changes, please see the Allocation Year Transition <https://www.nersc.gov/users/allocation-year-transition-2019-to-2020/> webpage.

First, if you are a project PI or proxy for a continuing project, please validate your user list in Iris <https://iris.nersc.gov> as soon as possible. If you do not, none of your users will carry over into the new allocation year, which can cause confusion and delay the restart of your project.

Second, during the transition we have scheduled a downtime for all systems, beginning at 7:00 am (Pacific time) on Tuesday, January 14. During this downtime, we will upgrade the batch scheduler and the programming environments on Cori, and delete all scavenger jobs, jobs from non-continuing projects, and held jobs older than 12 weeks. In addition, we will mount the Project file system on all nodes in read-only mode. We have allocated 24 hours for this process, but should the systems come back early, user jobs will begin to run at that time; if they come back exceptionally early, charging for user jobs will not begin until midnight (the very start of Wednesday).

Third, there are a number of major changes in AY20. The default programming environment will change; the biggest impact of this change is that dynamic linking becomes the default. Python 3 will become the default Python module. The charge factors on Cori will change to 140 on Haswell nodes and 80 on KNL nodes. And although we hope this does not become relevant to any users until much later in the year, the QOS formerly known as "scavenger", which permits users with a zero or negative allocation balance to run jobs at extremely low priority, will now be called "overrun". Illustrative examples of several of these changes appear at the end of this message.

Finally, I would like to recap our plans for transitioning from the Project file system to the new Community file system (CFS). When the machines return to service on Wednesday, the Project file system will be mounted read-only. This is so that we can perform a final sync of the data on Project to CFS before making CFS available and retiring Project. We have allocated a week for this sync, but should it complete sooner, we may be able to make CFS available early. When we do make CFS available, there will be no outage; we will perform rolling reboots of the compute and login nodes. Users will not notice any changes on the compute nodes, except that if you acquired the use of a node before the reboots began, it will remain in the old configuration, with the read-only Project file system mounted, for the duration of your job. The node will be rebooted into the new image with read/write CFS after your job releases it and before the next job runs on that node. Users of the login nodes may be logged out so that a reboot into the new image can be performed. If this happens to you, simply log back in: the load balancer will place you on a node with read/write CFS access (a quick way to check which image your node has is sketched below).

Thanks for your patience as we upgrade our services and systems to better serve you in the new allocation year!
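A minimal sketch of forcing static linking once dynamic becomes the default, assuming the usual Cray programming environment mechanism (the CRAYPE_LINK_TYPE variable) still applies after the upgrade:

    export CRAYPE_LINK_TYPE=static   # dynamic is the AY20 default; request static explicitly
    cc -o myapp myapp.c              # the Cray compiler wrapper now links statically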
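To see which Python versions are installed, and to pin one explicitly rather than relying on the new Python 3 default, the standard module commands apply (no specific NERSC module versions are implied here):

    module avail python    # list the available python modules
    module load python     # in AY20 this loads a Python 3 module by default
    python --version       # confirm which interpreter you picked up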
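As an illustration of the new charge factors, assuming the usual charging formula of nodes x wallclock hours x charge factor, a 10-node job that runs for 2 hours would be charged:

    Haswell:  10 nodes x 2 hours x 140 = 2,800 NERSC-hours
    KNL:      10 nodes x 2 hours x  80 = 1,600 NERSC-hours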
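Once a project's balance reaches zero, a submission under the renamed QOS might look like the following sketch (the QOS name comes from this announcement; the other flags are generic Slurm options shown only for illustration):

    sbatch --qos=overrun --nodes=2 --time=04:00:00 my_batch_script.sh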
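To check whether the login node you landed on has been rebooted into the new image, one quick check (mount point names are assumed, not confirmed) is whether the old Project mount still shows as read-only:

    mount | grep -E 'project|cfs'   # an 'ro' flag on Project means the old image; a CFS mount means the new one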
Regards,
-Rebecca

--
Rebecca Hartman-Baker, Ph.D.
User Engagement Group Leader
National Energy Research Scientific Computing Center | Berkeley Lab
rjhartmanbaker@lbl.gov | phone: (510) 486-4810 | fax: (510) 486-6459
Pronouns: she/her/hers
