
2010/2011 User Survey Results

Comments - What can NERSC do to make you more productive?

Queue suggestions and comments

Longer wall times

Longer wallclock times available on some of the large machines would help me.
longer available job times
Longer job running times.
If possible, I need longer wallclock time for my jobs.
Increase walltime when one runs VASP.

For now, you could open an additional long-running queue for VASP. The code scales well up to ~1024 cores and would benefit much more from uninterrupted runs. It would be great to have at least 48 hours of running time. Thank you.

Increase max run time for jobs

The walltime is usually somewhat short, and it sometimes makes me work less productively.

A more flexible queue structure may help. For example, for some smaller-core jobs, extending the wall clock limit would be very helpful to me and save me time.

I think maybe allowing some more special queues for longer jobs would help.

Increase walltime one can request. There are some jobs we just can't finish in the allotted walltime and the problem is that they can't be restarted that well.

I wish I could request more hours (several to many days) for queued jobs

If the queue time limits could be relaxed, it would be great.

more flexibility with the walltime limits for parallel jobs

Increase max walltime in Hopper to 48 hours when using less than 16 nodes.

Introduce on Hopper a 48 hour queue.

Increase the wallclock limits on Hopper to 48 hours to bring it into line with Franklin and Carver/Magellan.

Increase the maximum run time on Hopper from 24 to 48 hours (or longer). Many of us need to run the same job for days if not weeks, and having to restart the job every 24 hours is very counterproductive. The present queue backlog on Hopper is not so long that some fraction of the machine (say 25%) could not be allocated to longer queues (particularly for smaller jobs). I know this would be appreciated by a large number of users.

Again, as I commented, it would be awesome and would help me a lot if Hopper had a longer queue (e.g., 36 hours or even 48 hours) than the 24-hour queue. Some of my jobs could not be finished in 24 hours in the reg_small queue, and I had to resubmit them to finish.

The running time limit right now for Hopper and Franklin is at most 24 hours. Therefore our jobs have to frequently write to a restart file, and we have to manually restart jobs from the restart files every 24 hours. Many of our jobs would require running times much longer than 24 hours to produce statistically satisfying data. If for some jobs the running time limit could be extended to something like 48 hours, then we would be much more productive without babysitting jobs all the time.
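
One common workaround for hard wallclock limits is to chain jobs so that each submission restarts from the latest checkpoint and then resubmits itself. A minimal sketch of such a script is below; the script, executable, and flag-file names are hypothetical, and it assumes the PBS/aprun batch environment used on the Cray systems of that era.

  #PBS -q regular
  #PBS -l walltime=24:00:00
  #PBS -l mppwidth=1024
  cd $PBS_O_WORKDIR
  # The application writes restart files periodically (application-specific).
  aprun -n 1024 ./mycode input.nml
  # If the run has not yet converged, resubmit this same script so the next
  # job continues from the latest restart file.
  if [ ! -f done.flag ]; then
    qsub chain_job.pbs
  fi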

The time limit on Franklin is too short; it is supposed to allow 48-hour runs, but it does not. The reason is unclear.

Many Hopper queues have been increased to 36 hours. We will consider increasing them to 48 hours in the new allocation year.

Shorter queue wait times

As always, decrease wait time, increase wall time.

NERSC should have better turnaround time in the queues, as well as longer runtime queues.

Get batch wait times down. Hard to do without more resources.

Reduce queue times :)

Hum... Buy more computers so that my jobs don't have to wait in the queue.

Reduce queue times. Improve the queuing system?

My only quibble could be with batch wait times and turn-arounds, but this is really no problem (when we place this issue in the greater scheme of things).

Less wait time in Queues. ...

More computing resources = less wait time.

Make the number of hours allotted per job on the queues longer without increasing wait time!

The only remaining big problem is the several-day batch queue wait times. I don't know how to change that, unless there is a general policy of over-allocating the resources. If that is the case, better aligning the allocations with the time actually available may reduce the wait times. If the large-job discount is heavily used, that should be taken into consideration when determining whether the systems are over-allocated.

shorten the queue time of Franklin

Decrease wait time on Franklin.

There were times during which my throughput on franklin seemed very low. Hopper has been a valuable addition.

Try to improve the wait time on reg_small on Franklin.

One of the biggest limiting factors is the long queue wait times on Carver.

make the carver queue go faster (increase the size of carver)

Hard to get jobs thru on Carver.

better interactive/debug queue on hopper, sometimes the delay is too long

The new Hopper system is great (maybe you should have another one of these to increase the speed of the queue).

Better queue turn-around on smaller (8-128) node jobs.

I run lots of very short (~15 minutes or less), highly parallel jobs (256 to 8K processors) for debugging purposes, to debug problems that don't occur with fewer processors. If there were a way to speed up turnaround time for such jobs, it would improve my productivity dramatically.

Queue management suggestions and comments

A more accurate "showstart" command would be very useful for planning computing jobs and analysis.

It would be useful to have an estimated start time for submitted jobs. Although jobs enter the queues in order, certain jobs that are either small or short run much sooner than other jobs. In terms of planning work, it would help to know, roughly, when a given job should run.

I would love to see a column on the "queue look" webpage that gives the estimated time to job start.

We agree that the showstart command is not useful for any job other than the one at the top of the waiting list. We've found that predicting start times is inaccurate because running jobs often end sooner than the time they requested. Additionally, jobs submitted to the high-priority interactive, debug, and large queues change the start time estimates. On the queue pages we have added a column that shows a job's position in the queue, which we hope will give users a better idea of how much longer they have to wait.

Queues are a bit slow. My jobs sometimes fail in the middle of a run for no reason. I can restart and finish them, but I have to wait in the long queue again. Could there be some mechanism for identifying failed jobs so that they can be given higher priority when they are restarted?

NERSC could develop a system for monitoring VASP jobs. It is possible to interrupt VASP softly, allowing it to write out the charge density. The problem is deciding whether to start a new optimization step or to write the density and finish the job. Killed jobs leave no charge density, making restarts more time consuming.
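
For reference, VASP itself provides a soft-stop mechanism via a STOPCAR file, which a batch-script watchdog can trigger shortly before the wallclock limit. A rough sketch follows; the timings and core count are illustrative only.

  # Schedule a soft stop ~30 minutes before a 24-hour walltime expires.
  # LSTOP = .TRUE. asks VASP to finish the current ionic step and then
  # write its output (including the charge density) before exiting.
  ( sleep $(( 24*3600 - 1800 )) && echo 'LSTOP = .TRUE.' > STOPCAR ) &
  watchdog=$!
  aprun -n 1024 vasp
  kill $watchdog 2>/dev/null   # cancel the watchdog if VASP finished early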

Increase the number of small jobs one can have running simultaneously on Hopper.
Allow automatic reassignments of Carver jobs to Magellan, when Magellan is underused.
I'm sure there might be some non-trivial scheduling issues associated with this, but there are certain aspects of our work that require large numbers of jobs (100s-1000s) that are not very parallel (i.e., don't scale well beyond 1-4 nodes). Clearly it would be nice to run these at NERSC because the overall computing requirement is still large, but the batch queues don't really facilitate running many "small" jobs (despite their being relatively easy to schedule as "backfill"). The most obvious solution I can think of -- essentially limiting the maximum number of simultaneous cores instead of the maximum number of simultaneous jobs -- defeats one of the main purposes of NERSC and is thus not a good one. Nonetheless, it would be nice if there were a way to do this. Obviously we're pretty happy with NERSC though, so this is not a deal-breaker...
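
One partial workaround at the time was to bundle several independent small tasks into a single batch allocation. A rough sketch is below; the task and input names are hypothetical, and it assumes Hopper's 24-core nodes and the aprun launcher.

  #PBS -q regular
  #PBS -l walltime=12:00:00
  #PBS -l mppwidth=96          # four 24-core Hopper nodes in one allocation
  cd $PBS_O_WORKDIR
  # Launch four independent one-node tasks side by side, then wait for all.
  for i in 1 2 3 4; do
    aprun -n 24 ./mytask input.$i > out.$i &
  done
  wait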

Right now the queues are pretty heavily biased against users running many smaller jobs on fewer CPUs, but this has been improving!   For my systems, the former limit of 12 H on Hopper for small jobs limited the usefulness of hopper, but now the 24 H limit is very favorable. The same can be said for increasing the queue limits to 8 running and 8 queued jobs. I would prefer slightly higher queue limits for smaller jobs, but the recent increases have been very welcome.

On the occasions when I need to run many jobs requesting only one node, the scheduler only allows for a handful to be run at one time.

NERSC recognizes that many science applications require thousands of jobs on a relatively low number of cores.  We have recently increased the number of concurrent running jobs allowed on Hopper and we've also added a serial queue on Carver.  Users with special requests should contact the consultants.  We have the capability to reserve a block of time or nodes for users with special needs.

Quantum Monte Carlo is one of a few methods that can readily exploit high concurrency computations. From this point of view, discounted charge factors for large queues would be helpful.

It would be very nice if the charging factor for computing time were much lower, because large jobs are really expensive.

better back-up (xfer) queue management

One suggestion I had from the evaluation period was to create an xfer queue on Hopper, but I see this was recently done. 
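
For readers unfamiliar with it, an xfer job is essentially a small serial batch job dedicated to moving data to HPSS. A minimal sketch follows; the walltime, file, and directory names are hypothetical.

  #PBS -q xfer
  #PBS -l walltime=06:00:00
  cd $SCRATCH/myrun
  htar -cvf run42.tar output/    # bundle a results directory into HPSS
  hsi put bigfile.h5             # or store a single large file directly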

allow time-critical batch jobs to have higher priority. Not sure if this is feasible or not.

Users can submit to the premium queue to gain higher priority.  The premium queue has double the charge factor and so should be used only for time critical jobs.

better control over job requests, e.g. being able to request physically contiguous sets of nodes on hopper.

Add a premium batch queue to Hopper.

A premium queue now exists on Hopper.

I have noticed several times recently that sometimes there are no jobs in the eligible queue and Hopper is not running at full capacity. I think it would be a good idea that when this happens, the maximum number of jobs a user can run be dynamically increased from 6 to a higher number, so that Hopper does not go underutilized at any time. Wasted cycles are never good for anyone.

I like the concept of the 'scavenger' mode at Kraken, under TeraGrid. Jobs whose results would provide insight into the production calculation, or serve as a further check of a result, can run there at no cost to the main account, but at very low priority. It would be good if NERSC had a queue that one could submit to so that, in times when a machine is underutilized, such a calculation could be run.

1) It would have slightly less priority than low.
2) It would fulfil the 'just one last check under this condition' use case.
3) I need to profile my code, but I don't want to burn the main account.
4) It might produce an interesting scientific result when the code is run under extreme condition(s), but you would not normally risk the time.

A scavenger queue has been added to Hopper and Franklin.  Thanks for the input.

The long run (bend-the-rules-slightly option):
For long runs (48 to 96 hours), people could submit jobs at, say, up to double the time, with half the priority and perhaps 200% more cost. This would enable the 'last check' run, but with enough cost that people don't do it on a regular basis.

Regular queue drains on hopper sound good; I am the type of user who can  take great advantage of those.

 I find that I only use HOPPER. I used to also use FRANKLIN, but I started to have trouble submitting jobs, so I just stayed with HOPPER instead. For me personally, it is more efficient to work on a single machine. Much easier than having to remember which jobs are running where. But as long as HOPPER isn't overloaded, I don't mind other users working on other machines!

Software suggestions and comments

Would be nice to see the Intel compilers on Hopper.
The presence of Intel compilers on Hopper and Franklin!!! We have problems compiling some codes based on advanced C++ template techniques (mostly in the Boost library) with the PGI, PathScale, and Cray compilers. GNU is OK, but we also need some additional features, such as quadruple floating-point precision in Fortran, etc.
I also like to use Midnight Commander. However, I haven't seen it on any supercomputer. Don't know why :).
I am currently unable to use Hopper due to a plethora of problems with my application code and the various compilers/libraries available on Hopper. Ironically, my Cactus-based code works just fine on NICS's Kraken (an XT5), but has all sorts of memory/MPI/OpenMP issues on Hopper, leading to incorrect results. We are still working on tracking down the problem, but it has been a major drain of time for my group. Unfortunately, our code is so complex that we cannot easily hand it over to the consultants to 'let them fix the problem'. What would help, though, would be getting the Intel compiler suite installed on Hopper.
Please, please, please provide an Intel compiler environment on hopper2. Our group tried hard using the PGI compiler, but the time invested so far hasn't been worth it, and we have now switched to gcc despite the possible decrease in simulation performance. We have a lot of experience with the Intel compilers on other systems. They typically produce the fastest binaries from our codes even on non-Intel CPUs, and while the Intel compiler also has some problems, they are much more manageable than with PGI.
The Intel compilers have been added to Hopper. (They are also on Carver.) Thanks for the feedback.
Make compiling and linking easier.
Cleaner development environment.
We have recently had problems building VORPAL or necessary packages with new PGI compilers and/or their Cray wrappers. (An example of this is the missing std::abs(double) on freedom.) Better compiler reliability before making a version the default would be good.
VASP on hopper2 is not so stable; the speed is not so good, and it sometimes stops suddenly.
To the extent the environment on NERSC machines mirrors the environment on my workstations and laptops, my work is more productive. Also, I can often do initial development on my (much smaller) machines and then transfer it to NERSC facilities when it is more mature.
To that end, things like support for shared libraries and software environments like Python are very helpful and increase my productivity. This has been more and more true for the newer systems, e.g., "hopper", but I want to stress that it is very important and should continue to be supported and developed.
Also, it would be REALLY nice to be able to cross-compile on my workstations and/or laptops for the NERSC machines. It would make the compilation process much more decoupled from the NERSC user resources and allow me to use personal workstations with 16-48 processors to accelerate compilation times!
Would be nice to see XEmacs installed.
Python is more supported than previously, but still find things that don't work.
Module system is getting out of hand.
Wider choice of software and tuning tools.
More python libraries available through "module load numpy" for example.
I still can not figure out how to run replica exchange molecular dynamics (REMD) with NAMD on Hopper and Franklin. If you can help me with it, I will appreciate it.
Also, is it possible to install Gaussian on Franklin or other clusters?
If Molpro were available on Hopper, our research would benefit greatly.
NERSC can add visualization software such as SMS and GIS, to ease my analysis of output data.
The NX server may greatly improve interactivity; that had been poor in the past.
Faster response for remote X windows.
Add gv to Hopper software for viewing .ps files
Add nedit to Hopper software for editing files 
Interactive debugging is a bit painful. Regrettably, I don't know how to make it less painful, but if you think of anything
Reliable, usable debuggers can come in handy. Totalview hadn't worked for me in years so I gave up on it. Always possible to fall back on print statements, of course.
Weird how IDL can't read some of my netCDF files.
I like profiling tools that I can interface with my code at the compile line.
Good: poe+, ipm.
Bad: CrayPat (setenv PAT_RT_HWPC 5) ... if I have to google the profiling option, it has just failed. Or at least provide a default script that would simplify this for the user and take the pat_build, relabel, resubmit-again hassle away.
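
For reference, the CrayPat cycle the commenter is describing looks roughly like this; the executable name and core count are hypothetical.

  module load perftools
  make                           # rebuild the code with perftools loaded
  pat_build -O apa ./mycode      # produces the instrumented binary ./mycode+pat
  export PAT_RT_HWPC=5           # select a hardware performance counter group
  aprun -n 256 ./mycode+pat      # run the instrumented binary in a batch job
  pat_report mycode+pat+*.xf     # generate the profiling report
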
One problem I have had is regarding visualization software for my field. The software has been installed, but there is not someone to help with troubleshooting. I realize that users may need to be on their own in some instances, but it is a bit confusing to me that installed software is not supported by the consulting staff. If software is installed, it should be supported, in my opinion. In my case, I search out HPC resources that have software I need installed, with the assumption that it will be supported by the consulting staff. If this is not common or perhaps my assumption is faulty, then I realize this is not a NERSC issue but rather my own.

Hardware suggestions and comments

More FLOPS, disk and network bandwidth! (Does everybody answer this?)
Get sufficient DOE funding to be able to run separate capacity and capability systems.
More money for larger computers would be very helpful!
NERSC should buy a Blue Gene/Q system to better support low-memory applications with well-behaved communication behavior.
Also, it would be helpful to have more large-memory nodes available (at least 4GB/core).
Add some disk space to some carver nodes.
I really think if we could have directly attached storage on some machines my productivity would increase twofold.
It would be helpful if the number of nodes on Carver could be increased. Its per-core performance is so much better than Hopper's or Franklin's.
increase the size of carver
The hopper login nodes seem overwhelmed. I frequently see many users and several long-running tasks on the nodes. Difficult to find a "free" node. Would be nice to see more login nodes.
My primary productivity "liability" is lack of time to invest in doing things better. I don't think NERSC can help with that. There are two things I wish could be better, however:
A Hopper with faster processors! It's great to have access to so many cores, but the slower speed (say vs. IBM bluefire at NCAR) means one has to use nearly twice as many to get the same throughput. At least Hopper is faster than Franklin, which is rather slow.
More available nodes on PDSF.

File Storage and I/O suggestions and comments

On Franklin, the default quota in scratch is a bit small. Of course, we can request a temporary increase, but the default could be a bit larger than 750 [GB].
Increase my quota on [Franklin] scratch from 750GB to 3TB.
More than 40 gigs of [HOME] storage space would be quite nice. I tend to fill up that space with a few high fidelity simulations and have to transfer off data to HPSS that I will need to access to do my data analysis.
The disk space in home could also be enlarged to avoid wasting time on long back-up procedures.
User quotas are a bit tight, and therefore heavy use must be made of moving stuff back and forth between SCRATCH and GSCRATCH. This is especially true when I have tried to move data back and forth to Euclid, e.g., to run a Matlab session.
More available disk space between global scratch and PDSF would be helpful for cloud based computing buffered on carver.
Allow users to keep their raw data in global scratch disk for longer time (3-6 month for instance).
Keep /scratch2/scratchdirs/mai available on hopper.
I have a comment about the scratch purging policy. For the most part, the members of my group back up our important files from scratch locally (i.e., on our own drives) or on HPSS, but every once in a while they forget (particularly new group members).
I wonder if you have considered using the following: when doing a purge, first create a list of all the files that have not been touched beyond the cutoff date, then reorder those files in terms of size, then delete the files starting with the largest until a threshold has been reached, but leaving most small files intact. This has the benefit of deleting most large intermediate files but preserving the small but important input / output / result files.
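
A sketch of the suggested policy in shell form is below; the path, the 90-day cutoff, and the reclamation target are all hypothetical.

  # List files untouched for 90+ days, largest first.
  find /scratch/$USER -type f -atime +90 -printf '%s %p\n' | sort -rn > candidates.txt
  # Delete from the top of the list (largest files) until enough space is freed,
  # leaving most small files (inputs, outputs, results) intact.
  target=$((5 * 1024 ** 4))   # e.g. reclaim about 5 TB
  freed=0
  while read -r size path; do
    [ "$freed" -ge "$target" ] && break
    rm -f -- "$path"
    freed=$((freed + size))
  done < candidates.txt
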
The global scratch is great; however, I cannot write my data directly to it because of the I/O performance. I first need to run on Hopper /scratch and then transfer my data, which takes a bit of time. How about having a serial queue available on Hopper to do this kind of thing? An xfer queue to transfer data to HPSS would also be helpful.

Also it would be great if Franklin was also linked to /global/scratch
Due to the large size and amount of data generated by my computations, I have to perform most post-processing, data analysis, and visualization in my account at NERSC. Unfortunately, transferring data from the scratch space of the computing machines such as Franklin and Hopper to an analysis/visualization machine such as Euclid has become increasingly inconvenient and inefficient. The ability to perform data analysis/visualization at NERSC has been strongly compromised.
It would be highly desirable to have a shared, global scratch file system that allows simultaneous accesses from both the computing machines and the analysis/visualization machine, just as the way the $HOME directory has been setup.

Increase the speed of tab completion. For some reason (I believe it's latency of the global file system, but I'm not sure) tab completion of commands and file names has been slow on NERSC machines lately. I know it's a small complaint, but slow auto-complete can really break your command-line rhythm.

I just feel that the I/O on Hopper $SCRATCH is sometimes slow and not very responsive. Hope this can be solved after the planned upgrade.

Nothing that you're not trying to improve already - specifically the performance of disk space.

Improve I/O performance for data intensive computations.

The time to build code on the login nodes for Franklin and Hopper is very slow compared to other big parallel machines I have used, specifically, running configure scripts and compiling Fortran code. When doing a lot of development, this performance is very important and improving it would be helpful. 

smooth out internode i/o on new Cray systems [there have been a few software issues on hopper that affected parallelization of some internode i/o intensive jobs]

HPSS suggestions and comments

I don't know if this is possible, but it'd be nice if the hsi command line interface had tab completion. For example, if you enter paths or file names. I've requested this before, & I think the answer was that it was not possible.
A better shell interface to HPSS with hsi would be nice; for instance tab completion would make using the system easier.
HPSS sometimes wants to drop the ftp connection from my desktop and I am finding it easier to move the data to scratch and scp it from there.
If I can somehow store my data during batch jobs directly to the HPSS and also be able to retrieve data from HPSS directly to my local computer, I see my productivity increasing significantly.

It would be great to have the globus connection to the HPSS again.
htar really needs to support larger files. My largest wave function files are now rejected, which is extremely inconvenient and requires using hsi to transfer just the one big file or compressing it separately. I'd like to have a backup archive of entire calculations, and htar is failing. It should support the largest files possible on the systems at NERSC; realistically, developers are going to avoid fragmenting logically single files just for the convenience of support applications.

HPSS file management can be somewhat time consuming.
The interface to hpss, via hsi and htar is a bit clunky. It would be good to browse tar files moved to hpss more easily than htar -t. Maybe some graphical browser?

Maybe there is a reason for this, but the interface for HPSS is just atrocious when I use hsi. Why can't I delete batches of files and use the commands usually available to me in Linux? Perhaps it is to protect me from myself in terms of not deleting my data, but it drives me crazy how bad the interface is. It probably takes me 5 times as long to do things in HPSS with hsi because of this.
Is it possible to copy (synchronize) data (directory) to the storage server?
I don't like having to come up with my own backup solution. It seems that there are great tools available for backing up, yet I have to do this myself to HPSS. I've actually lost a lot of time when /scratch was erased trying to recover -- not data, but just my state. It seems like NERSC has decided to put the burden of backing up on the users. This criticism assumes that the disk quota for home is not sufficient.

Allocations and Account Management suggestions and comments

Would like increased allocation

Other than the increased allocation, ...

Larger allocations and machines

More hours.
Secondly, I think my allocation time should be increased each year. I normally applied for 2 million hours, but always got 1 million hours. But in the meantime, I notice some other projects got more than needed. Then, we are allowed to ask for additional hours. This practice has a limitation. It is difficult to plan things ahead of time. I think this is probably the responsibility of DOE not NERSC. DOE can check the usage of previous years for one user group, and then determine the future amount of time. Just an idea!
More allocation opportunities during the year.
more flexibility with the resource allocations

ERCAP allocation request process

Simplify annual ERCAP renewal process and only require it on a 2-3 year schedule instead of every year.
Regarding reversions of ERCAP allocations: I appreciate (and have benefited from) the fact that account usage is monitored for unused CPU hours, and that such hours are then given to other users who can use extra time. However, users in our group are not doing "production" computations, but rather research/development on improved boundary conditions, elliptic solvers, etc. -- all in the general march towards using more cores. Typically there is much analysis of the code (as opposed to the resulting data) required in this work (our code is an atmospheric model). Further, we all have other responsibilities that may dominate from time to time. As a result, we may have entire months where very few CPU hours are used. When some advance has been developed and coded, then we may have months in which we use a lot of CPU time as we examine various test cases. I don't know how this kind of usage can be better "monitored", but the CPU reversions are problematic for us. However, in most reversion cases to date, we have gotten back at least the amount of the reversion by the end of the ERCAP year. So we are tending to watch for that possibility at year's end and to be prepared to take advantage of it.
Please try to make the yearly web resource request (ERCAP) as "updateable" as possible, so we don't need to type in much new info.... There is typically not much change year to year.

Account management

One minor complaint: the login security does not allow more than 3-4 (?) password failures, and I often need a "reset" ... more "failures" should be allowed.
The login lock-out policy due to repeated unsuccessful attempts is too restrictive: it is very easy to get locked out by a few failures to type in the correct password. This is particularly problematic since remote access through the NX server is sometimes faulty, so the user does not know whether it is just the usual trouble or whether the wrong password has been typed (perhaps because he/she was forced to change the password recently). Moreover, the procedure to unlock the login is VERY annoying, since it requires calling the NERSC help desk, which is not staffed 24/7 -- a login lockout at the wrong time could easily mean losing a whole day of access and work. Either the help desk should be staffed at all times or, better yet, an automatic password reset or similar system should be implemented for users, in the same manner as many publicly available account services.

The number of login failures allowed has been increased slightly.  NERSC Operations Staff are available 24/7 to help users clear login failures.

More flexibility in dealing with group authorizations and shared control of resources, processes & data.

Consulting suggestions and comments

Often it is hard to get complex problems worked out with the consultants. Some tickets are not resolved for a long time. If a solution is not obvious, the attitude is to avoid solving the problem.

Ticket system not adequate for group use: one can't search for and comment on tickets opened by colleagues

Response to questions and online help.

Have someone on call for software/environment problems on weekends (especially long weekends).

On that note, allowing wider access to exchanges with NERSC consultants via email or the online interface would be useful. Often I'm working together with other VORPAL developers (usually John Cary) to resolve build issues, so all of us being able to participate in discussions with consultants would be nice.

The amount of email coming from NERSC is a bit excessive.

Training suggestions and comments

More tutorials for new users
Would be nice to see more examples online. Examples of all sorts of tasks.
More specific test examples for using libraries like SuperLU, PARPACK and sparse matrix packages, so one could test quickly before making the switch.
Parallel computing courses
I hope NERSC can offer more workshops/seminars about their systems. For example, I would love to learn more about the NERSC cloud computing and GPU computing facilities. I hope NERSC can have this kind of seminar/web seminar more often.

Availability, Reliability and Variability suggestions and comments

Inconsistency in calculation times continues to be an issue for my large runs. I think this is something of an endemic feature to HPC and is only likely to get worse as system (and problem) sizes scale up. Variations on the order of 50% make it difficult to streamline the batch system by requesting only the needed time.
There have been some issues with lustre robustness and we've noticed significant model timing variability on hopper (2).
Reliability issues (jobs lost to node failures) are still frequent, and require a fair amount of time to check that jobs finished properly and to rerun failed jobs.
I have mostly used Franklin at NERSC so far. My jobs often crash with a persistent system error, "node failed or halted event".
Franklin has been in operation for several years, but is still not stable. Lean on Cray to fix the problem.
Reduce Franklin downtime.
Two big problems: PDSF downtime for maintenance/upgrades is too frequent and intrusive; at the very minimum, it should be carried out ONLY at night and/or weekends (preferably weekend nights).
More uptime is always great when in need of it.
more uptime
Not much really, perhaps except extending uptimes of the systems.
Need to devise a method to improve fault tolerance. Lately I have been running jobs that require ~16384 cores on Franklin. After 2-5 days of waiting in the queue to gain access to the cores, more than 50% of these jobs fail with 'Node failed'. It only takes one failure to kill the entire calculation. There must be a way to request an extra two or four cores that sit idle during the simulation, so that when one of the cores being used in the calculation drops out, the information it had in memory could be handed over to one of these waiting cores, MPI could be told that the new core is the one handling the work, and the calculation could then move forward again. Anyway, my biggest problem running large jobs lies in this issue. It also wastes a large portion of my allocation. It would significantly help productivity if NERSC could devise a relatively turn-key strategy for users to deal with machine-level failures.

Networking suggestions and comments

Also, transfers from NERSC dtn_new to the user's desktop are 1/3 as fast as uploads to NERSC.
Increasing the bandwidth between NERSC and external users will remain a popular demand, but this is not necessarily dependent solely on NERSC.
Increase data transfer bandwidth with BNL

No Suggestions / NERSC is doing well

productivity roadblocks are pretty much on my end (for now)
I have a hard time thinking of anything that NERSC could do better.
You are quite already a very efficient organization.
Keep doing what they have been doing for the last few years
Keep doing what you are doing!
Keep doing an outstanding job!