Happy New Year - First Blog Entry
January 2, 2015 by Richard Gerber
Happy New Year to all!
Some users have asked for NERSC staff blogs on current happenings and events at NERSC, so here is a first attempt! No promises that it will be comprehensive or always timely, but I'll do my best.
This holiday break was largely uneventful, which is the way we like it. The systems have been up and stable, and lots of you have been able to get a lot of computing done. It's a good thing that the systems have been behaving because the backlog on both Edison and Hopper is huge: 25 days on Edison and 10 on Hopper. That means that even if no new jobs were submitted, Edison would be completely full through the start of the new allocation year on Jan. 13, 2015. It also means that it takes a long time for newly submitted jobs to start running.
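For a rough feel of what a backlog number like "25 days" means, here is a minimal sketch. It assumes the backlog is simply the total work requested by queued jobs divided by what the whole machine can deliver in a day; that definition, the function name, and the job list are illustrative assumptions, not a description of how the MyNERSC figures are actually computed.

```python
def backlog_days(queued_jobs, total_nodes):
    """Estimate the backlog in days for a machine.

    queued_jobs: iterable of (nodes, wallclock_hours) pairs, one per queued job.
    total_nodes: number of compute nodes on the machine (hypothetical value).

    Assumption: backlog = queued node-hours / node-hours deliverable per day.
    """
    queued_node_hours = sum(nodes * hours for nodes, hours in queued_jobs)
    capacity_per_day = total_nodes * 24  # node-hours the machine can run per day
    return queued_node_hours / capacity_per_day


# Hypothetical 100-node machine with three queued jobs.
jobs = [(50, 24), (100, 12), (25, 48)]
print(backlog_days(jobs, total_nodes=100))
```

Under this assumption, a backlog of 25 days means the queued jobs alone would keep every node busy for 25 days, even if nothing new were submitted.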
There's not much we can do about the wait times, but we have a few new tools you can use to monitor the situation and perhaps help you plan your job submissions. If you go to MyNERSC and choose "Queues->Backlogs" from the left-hand navigation menu, you'll find a plot of the backlogs on the machines over time. We only started collecting the data in mid-December, so the plot will become more useful as time goes on. It will give both you and us an easy graphical way to monitor demand throughout the year and to make informed and transparent choices if policy decisions need to be made. If you choose, for example, Edison's premium queue, you can see a rise in demand as the allocation year nears its end and some repos have time to burn. Since we don't want those who didn't use their time early to have an advantage, the premium queues will be disabled at 8:00 PST on Monday, Jan. 5. One interesting thing to observe on these plots: The black line shows the backlog for only "eligible" jobs. These are jobs that are actively being considered by the scheduler and could theoretically start any time resources become available. The blue line shows all jobs, including those in the queue that are not eligible to start because their owners have reached the limit on eligible jobs submitted to a queue.
The other new page at MyNERSC is under "Center Status->Usage Summary." There you can see aggregate usage plotted for each day of the allocation year. By default, you see the sum of all repos, but you can enter yours (or any other, for that matter) to see its usage throughout the year alongside what the usage would be if the repo's allocation were used at a constant pace. (There still might be a bug that shows an incorrect end date for repositories that started mid-year. Don't be fooled: your time expires on Jan. 12, 2015 at midnight just like everyone else's, except for ALCC awards, which are a special case.)
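If you want to do the constant-pace comparison yourself, the idea can be sketched as a simple linear interpolation over the allocation year. This is only an illustration: the function name, the dates, and the allocation size below are made up, and the Usage Summary page may compute its reference line differently.

```python
from datetime import date


def constant_pace_usage(allocation_hours, start, end, today):
    """Hours a repo would have used by `today` at a constant burn rate.

    allocation_hours: total allocation for the year (hypothetical value).
    start, end: first and last day of the allocation period.
    today: the day to evaluate; clamped to the [start, end] range.
    """
    total_days = (end - start).days
    elapsed_days = (today - start).days
    elapsed_days = max(0, min(elapsed_days, total_days))  # clamp to the period
    return allocation_hours * elapsed_days / total_days


# Hypothetical repo with 1,000 hours over a 10-day period, evaluated halfway.
expected = constant_pace_usage(
    1000, start=date(2014, 1, 1), end=date(2014, 1, 11), today=date(2014, 1, 6)
)
actual_used = 300  # made-up number for illustration
print(f"constant pace: {expected:.0f} h, actually used: {actual_used} h")
```

A repo whose actual usage sits well below the constant-pace value is one that will have "time to burn" as the allocation year ends.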
An older tool that you might not know about is located at https://www.nersc.gov/users/queues/queue-wait-times/ . This page gives you a "heat map" showing wait times for jobs of different sizes. Some people find this display confusing, so we're working on a page that will let you choose a job configuration (number of nodes and wallclock time) based on the number of MPP hours you need and recent queue wait time history. Look for this page sometime early in 2015.
Finally, if you haven't done so, please take the 2015 NERSC User Survey. This survey is extremely important to us because it's our best way to judge how well we're serving your needs and the primary way we report user satisfaction to DOE. We have a target of more than 600 responses this year, and we're at about 200 now. The survey is scheduled to close on Jan. 13, 2015. A detailed report on the results of last year's survey is now available at 2013 NERSC User Survey.