
2007/2008 User Survey Results

Overall Satisfaction and Importance

  • Legend
  • Overall Satisfaction with NERSC
  • How important to you is each of these items?
  • General Comments about NERSC

 

Legend:

Satisfaction | Average Score
Very Satisfied | 6.50 - 7.00
Mostly Satisfied - High | 6.00 - 6.49
Mostly Satisfied - Low | 5.50 - 5.99
Somewhat Satisfied | 4.50 - 5.49

Importance | Average Score
Very Important | 2.50 - 3.00
Somewhat Important | 1.50 - 2.49

Significance of Change:
significant increase
not significant

 

Overall Satisfaction with NERSC

7=Very satisfied, 6=Mostly satisfied, 5=Somewhat satisfied, 4=Neutral, 3=Somewhat dissatisfied, 2=Mostly dissatisfied, 1=Very dissatisfied

Columns 1-7 give the number of respondents who rated the item at each score.

Item | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Total Responses | Average Score | Std. Dev. | Change from 2006
OVERALL: Consulting and Support Services | | | 3 | 9 | 13 | 91 | 310 | 426 | 6.63 | 0.71 | 0.11
NERSC security | | 2 | 3 | 34 | 18 | 98 | 246 | 401 | 6.36 | 1.01 | 0.05
OVERALL: Satisfaction with NERSC | 1 | 4 | 3 | 11 | 36 | 172 | 220 | 447 | 6.30 | 0.92 | -0.01
OVERALL: Available Software | | | 3 | 19 | 43 | 157 | 176 | 398 | 6.22 | 0.87 | 0.24
OVERALL: Software management and configuration | 1 | 1 | 3 | 28 | 28 | 148 | 174 | 383 | 6.19 | 0.98 | 0.14
OVERALL: Network connectivity | 2 | 6 | 8 | 25 | 33 | 154 | 196 | 424 | 6.13 | 1.13 | -0.14
OVERALL: Available Computing Hardware | 1 | 7 | 8 | 13 | 44 | 178 | 180 | 431 | 6.12 | 1.05 | -0.06
OVERALL: Mass storage facilities | 4 | 1 | 11 | 38 | 19 | 98 | 173 | 344 | 6.06 | 1.28 | -0.10
OVERALL: Hardware management and configuration | 3 | 2 | 10 | 32 | 39 | 164 | 146 | 396 | 5.97 | 1.14 | -0.09
OVERALL: Data analysis and visualization facilities | | | 6 | 68 | 28 | 67 | 62 | 231 | 5.48 | 1.24 | 0.11
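
As a cross-check, each row's summary statistics can be reproduced from its rating counts alone. A minimal sketch in Python, using the "Consulting and Support Services" row above; note that the Std. Dev. column matches the population standard deviation (dividing by n, not n-1):

    import math

    # Rating distribution for "OVERALL: Consulting and Support Services"
    # (ratings 1 and 2 received no responses).
    counts = {3: 3, 4: 9, 5: 13, 6: 91, 7: 310}

    n = sum(counts.values())                                  # 426 responses
    mean = sum(score * k for score, k in counts.items()) / n  # average score
    var = sum(k * (score - mean) ** 2 for score, k in counts.items()) / n

    print(f"responses={n}, average={mean:.2f}, std_dev={math.sqrt(var):.2f}")
    # -> responses=426, average=6.63, std_dev=0.71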

 

How important to you is each of these items?

3=Very, 2=Somewhat, 1=Not important

Columns 1-3 give the number of respondents who rated the item at each score.

Item | 1 | 2 | 3 | Total Responses | Average Score | Std. Dev.
OVERALL: Available Computing Hardware | | 47 | 347 | 394 | 2.88 | 0.32
OVERALL: Satisfaction with NERSC | 2 | 59 | 359 | 420 | 2.85 | 0.37
OVERALL: Network connectivity | 2 | 90 | 295 | 387 | 2.76 | 0.44
OVERALL: Consulting and Support Services | 6 | 89 | 304 | 399 | 2.75 | 0.47
OVERALL: Hardware management and configuration | 14 | 132 | 221 | 367 | 2.56 | 0.57
OVERALL: Available Software | 23 | 130 | 219 | 372 | 2.53 | 0.61
OVERALL: Software management and configuration | 23 | 144 | 189 | 356 | 2.47 | 0.62
NERSC security | 45 | 154 | 168 | 367 | 2.34 | 0.69
OVERALL: Mass storage facilities | 53 | 129 | 162 | 344 | 2.32 | 0.73
OVERALL: Data analysis and visualization facilities | 117 | 90 | 79 | 286 | 1.87 | 0.82

 

General Comments about NERSC:   87 responses

  • 44 Comments about NERSC services / happy with NERSC
  • 27 Franklin comments
  • 11 Bassi comments
  • 5 Network comments
  • 4 Charging and Allocations comments
  • 3 disk quota comments
  • 3 PDSF comments
  • 3 Security comments
  • 3 Software comments
  • 8 other comments

 

Comments about NERSC services / happy with NERSC:   44 responses

NERSC systems seem very well managed. Job turnaround time is excellent, much better than other systems that I've used, even for very large jobs. Web site information is also terrific. I have no problem finding information about software, queue policies, etc. for each machine. Keep up the good work!

NERSC continues to be user oriented, with quick responses to problems.

... As for overall satisfaction, "extremely satisfied" was not an option. I would have chosen it if it were. The staff of this facility put the High Performance in HPC.

NERSC is an invaluable resource for the completion of my doctoral dissertation research. The consulting and web documentation are amazing. Smooth sailing and reliable from start to finish. ...

The process for requesting a start-up account was very straight-forward, and I received a very prompt reply.

NERSC is a model for all supercomputer centers.

Being at a small college without many resources, access makes the difference between having a great research programme and barely getting by.

NERSC is one of the finest computing facilities that I have used in the US and abroad. The level of support is unmatched.

I think they are doing a super job. I can not think of any other place better than this one. ...

I am utilizing a fair amount of the new resources. I am deeply impressed with NERSC programs and how they cater to the user. First class!!!

In general, I have found NERSC to be very responsive to complaints.

... I have always received great support over the phone and by email.

Relative to other high-performance computing facilities that I use, I feel that NERSC is well managed and does a very good job with technical support and computer usage policies. ...

Overall I think that NERSC is a very well-run computing center (compared to others with which I have had experience). The people are very helpful and informative, the website is well-designed, with important information that is easy to find. The tutorials at NERSC or LBL are very useful. ... Overall, though, good job.

In general NERSC has the best user support and reliability of all the computer resources that I have ever used, ...

NERSC continues to provide a superior level of technical expertise and user service on its consulting and hardware and software management staffs. ...

NERSC is an excellent center in terms of how it is run, and the services and resources that it provides to users. ...

Reliable, well-managed, essential in my research.

During the period when the HPSS system was having problems storing large numbers of relatively small files, I found the support staff to be very helpful and proactive.

NERSC has very very _very_ professional administration and user services. I am very impressed with their quality of service and with the effort and dedication that their staff puts into making the clusters easy to use and making lots of useful software available. I worked a summer at LLNL and found LLNL's system administration pathetic by comparison.

You should add another tier in the available choices for response called "extremely satisfied" so I can check this instead of just the "Very satisfied" that exists right now. I am extremely satisfied with the professional way that NERSC is handling the user community and the immediate response to any problems that arise from time to time.

... We get a lot of help in the material science area from user service.

Excellent consulting service. Many thanks. And particular thanks to Zhengji!!!

Account support uniformly splendid; consulting also excellent with one exception.

Very good consulting and account support services. ...

I find the NERSC people quick to help and solve problems. Thanks,

good job. thanks.

NERSC is an unparalleled facility with excellent support and a commitment to making the machines work well for user applications that makes it very productive. The new machine Franklin is coming on line well for us. The visualization group is very helpful, and we appreciate NERSC staff efforts to help us with allowing visualization codes to run on the new platform.

Thanks very much for access to your wonderful facilities. It is very useful to my work.

Better than ORNL!

Really great computing facilities and user support services!

NERSC's stability and level of readiness make it my preferred site for model development. ...

Overall I have been very pleased with the resources and service available at NERSC. I've been particularly pleased with how well NERSC has managed the roll-out of Franklin which, like all new hardware, has had a few bumps, but the staff has been very good about making quick fixes and communicating the status to the users. I've also had very good telephone support for account issues. I also like the on-line queuing page for Franklin.

I am very grateful to everyone at NERSC for keeping this facility of great scientific importance functional, reachable, and above all, easily usable. Best regards to you all. [Jacquard user]

NERSC is amazingly well run, and has a fantastic staff. Having worked with NCCS in the past year, I appreciate NERSC even more.

NERSC continues to be a reliable place to get my computational work done.

NERSC remains a great resource for scientific computing, and our group relies on it very much for computational research. ...

good to have such a powerful facility available.

I am very happy with the opportunity to use the NERSC facilities. The access to a large-scale computing facility has been extremely important to my work.

NERSC is an excellent means of advanced computing. I have been connected since the inception in one way or another, starting with John Killeen. I am very satisfied and very grateful for this professionally run facility.

Keep up the good work.

... Very good computer resources to do computationally expensive jobs.

NERSC is undoubtedly one of the most important resources available to thousands of researchers in various disciplines of science and engineering. For state-of-the-art research in certain areas of pure and applied science, NERSC is sine qua non, as it alone can offer memory, disk space, and CPU run time that are unmatched anywhere else. In my computational research in superheavy chemistry, we have run jobs that I could not have run elsewhere.
I realize that there are limitations at present on the parameters mentioned above, but I am sure these will be modified in the future to meet the needs of the ever-increasing community of users at NERSC. We are grateful to the DOE for offering this wonderful NERSC facility for research to users worldwide.

... Overall, this has been a VERY useful HPC resource!!

 

Franklin comments:   27 responses

Franklin reliability

Hardware stability is a big issue. Please consider stable machines and options when upgrading to new machines. ...

The transition from Seaborg to Franklin has been more difficult than I would have expected, mostly because of Franklin going down so often. This appears to have been fixed, so I would expect my answer of "somewhat satisfied" on Hardware management and configuration to change to "very satisfied" as Franklin achieves the reliability that Seaborg had. ...

It's not difficult to see why many people do not use Franklin -- the worst compute platform of my entire computing career (which started on a Burroughs B5500 in 1964 at the University of Denver). ...

Franklin is down too often...

General reliability of Franklin has been disappointing of late. ...

... However with the phasing out of the IBM Power 5 and its replacement by the Cray XT4, the overall reliability of NERSC systems has taken a definite downturn, with a noticeable negative impact on the productivity of our projects.

Although the queue time is short, Franklin seems unstable, compared to seaborg and bassi.

Franklin in particular has been unacceptably unreliable, and hence a lot of CPU hours as well as human time were wasted running jobs that crashed at random. This also made diagnosing errors much more difficult. In addition, the many scheduled and unscheduled maintenance sessions make using this computer very frustrating. Franklin is far less reliable than any other computer cluster or supercomputer that I have used in the past. ...

... But in one area it has been a big disappointment this year: The poor stability and availability of Franklin.

The main problem is the stability of the system and frequent unscheduled maintenance!!!

... However, the Franklin system has been a step backwards in terms of reliability. This has contributed to very long queues on Bassi, which is probably serving more types of computations than were intended when it was purchased.

Too many outages!

Franklin is great when up. Unfortunately, most of the time it seems to be down. This has severely crippled my research activities this year. I cannot even transfer my data to more reliable systems like Bassi, simply because Franklin is either not up or crashes during the process.
It's quite obvious that whatever problems Franklin has are of the severe sort, and I don't believe the daily few-hour maintenance breaks resolve them. I'd rather see Franklin down for a week or a longer period, if that prevented the random daily crashes.

Franklin is not a very stable system. ...

... however, Franklin is always going down, and that disrupts the flow of our work. I think once the people at NERSC get Franklin fine-tuned, NERSC will go back to the top-notch reliability that all its users are accustomed to.

... There have been some notable problems with new hardware such as the Franklin machine, but that is somewhat to be expected.

I think that Franklin is still stabilizing, ...

Difficulties using Franklin

The transition from Seaborg to Franklin has been tough! Bassi seems to be very congested as a result of Seaborg being taken down while many users have not yet ported their code to Franklin. It seems like NERSC could perhaps be a little more proactive in helping people port to Franklin. In a best-case scenario, consultants could offer to take code that compiles and runs on Bassi or Seaborg and port it to Franklin for users.

It takes an extraordinary effort to get my code working optimally on NERSC machines, if even possible, and as a result, I use these machines only as a last resort. I can't tell if this is because the system settings are too conservative or because Cray is not sufficiently interested in usability. Honestly, it took me one day to get my stuff running beautifully on BlueGene/P before the machine was even officially in-service, yet I have not run a single non-trivial job on Franklin except by using settings which render it no more powerful than a 32-node Infiniband cluster.
Possible solutions:
1. Buy a machine for big-memory applications, i.e., 8 GB of usable memory per node.
2. Force Cray to create a manual for Franklin which allows a reasonably-intelligent code developer to figure out optimal memory settings without having to monopolize a consultant for multiple days and devote a week to tuning these via a guess-and-check approach.
3. Provide code developers of critical DOE software with time specifically for debugging purposes so they don't have to waste 25% of their allocation, which is supposed to be used for science, on this task.

... Licensing issues with compilers can be frustrating. ...

... And the computation time limit is too short (24hr).

... My only criticism of Franklin is the large reduction in available memory after Bassi. For those of us who maintain large suites of codes, a lot of restructuring was required to fit codes into 1.875 GB of memory. I do not understand why this is not closer to 2.14 GB. If people need profiling/debug and large MPI overheads, can they be put on a subset of the compute nodes? The 'result' is more important to me than how fast I got it.

Problems with Franklin performance

At times it is very difficult to work on Franklin. Pulling up an X window is so slow that I try to edit files using emacs without an X window. Even then it can be terribly slow to do anything: I press a key and 30 seconds later I get a response from the terminal. I assume this is because so many people are logged into the head node. There needs to be some way of managing or upgrading the resources to make Franklin usable.

... Franklin file system is sometimes very slow. ...

... Franklin sometimes responds very slowly, especially when the command needs HD access.

My overall rating for NERSC is skewed by the little time that I actually run there, and primarily is a comment on my issues with Franklin [performance variability]. As such, my opinion should be given very little weight. I do not consider myself to be a NERSC user, per se.

Happy with Franklin

... The new machine Franklin is coming on line well for us. ...

In general, I have been very happy computing on Franklin. I was able to get a lot done and when there was any problem the support staff was very responsive. ... I finished a run on Franklin and have been fairly inactive the past several months.

... Otherwise, it's a nice, fast, easy-to-use machine, with sensible scheduling policy - it's easy to be productive if the machine stays up (now that we have a sensible amount of disk space). ...

I have been using NERSC for just three months and I am very satisfied; it is an important computational resource. The results obtained with DFT molecular dynamics simulations in the last month greatly improved my research activity. I am very satisfied with this centre; in particular, the CRAY XT4 has been very useful because of the short time you have to wait before getting a running job.

One nice thing is that I can choose between queues. For example, I can use the debug queue to run a test code in a couple of minutes. If I have to run a code that needs hours of CPU time, then I can use the regular queue.

 

Bassi comments:   11 responses

... The only major issue that arose was the decommissioning of Seaborg. However, once I transferred to Bassi, I never looked back. Bassi is a beautiful architecture. It can get a little crowded at times and I don't get the turnaround on jobs that I had on Seaborg, but it is worth it.

Having access to bassi in particular strongly helps my research projects.

... but bassi has been a great platform to run on.

... It is also my preferred site for all but my largest simulations, where NERSC's success creates its one weakness: it is so popular that it is hard to get through the job queues.

The waiting queue is too long today.

... Bassi is reliable but the queues are so long that it takes weeks for even a small job to run, therefore rendering it quite unusable.

I am waiting too long in the queue.

Bassi queue time is so long. ...

The management of INCITE accounts is detrimental to the entire community. There has to be a better way than allowing INCITE accounts to jump to the head of the queue. It means that until INCITE accounts use up their allocations, NERSC is unusable except for low priority computing by the average user.

The waiting time for Bassi is too long. Otherwise, I am satisfied with the service by NERSC team.

... The queue is sometimes too long. Interactive jobs should run immediately. It may also be a good idea to have an interactive or debug queue for large numbers of processors, since we often want to take timings before the big job. ...

 

Network comments:   5 responses

My experience is biased, as I work from France and that makes the connection very slow.

... My comment above about network connectivity is less positive as I seem to recall rather slow transfer speeds to some of the NSF sites and FNAL. I was moving many output configurations to other centers. These files tended to be on the order of 6 GB, as best I recall. ...

We run nightly tests which require connectivity to nersc systems. In more than a year, we have had tests fail only once due to lost connectivity to nersc. Excellent job! ...

The data transfer time between the scratch disk on Jacquard and my "home" linux computer is still rather slow. I don't know if it's due primarily to the NERSC side, my side, or a combination of both. ... [University of North Carolina at Asheville user]

I have only been able to achieve upload and download bandwidth of 400 KB/s from Rice University using scp or http. This is unacceptably slow. Is there a better way to move data? The problem is not at Rice. Something seems to be limiting bandwidth within ESnet.
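
To put the reported rate in perspective, a back-of-the-envelope sketch (the ~6 GB file size comes from the earlier comment about transfers to NSF sites and FNAL; the 400 KB/s rate from the comment above):

    # Rough transfer-time arithmetic from the figures quoted in this section.
    size_bytes = 6e9      # one ~6 GB output configuration
    rate_bytes = 400e3    # observed 400 KB/s bandwidth
    hours = size_bytes / rate_bytes / 3600
    print(f"{hours:.1f} hours per file")   # ~4.2 hours per file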

 

Charging and Allocations comments:   4 responses

... One comment is that maybe they should charge half of what they are charging on Franklin, so that I can use them more frequently, and be more productive.

The main problem we had working with NERSC was the ease with which it was possible to blow through our entire allocation before we realized we had even done so. This occurred for a number of reasons.
1) The accounting system is not very intuitive and is geared towards mistakes unless you actually read up on it carefully (I don't think many people actually do this). It seems to me that using a node for 1 hour under regular circumstances on the flagship machine should be counted as 1 node-hour and everything else should be adjusted downward from that (except possibly using a high priority queue). The fact that you need to consider an extra 6.5 multiplier when using Franklin is asking for trouble. We won't forget to do it in the future, but we found out about this the hard way.
2) The allocation management seems to be non-existent. There is no indication that you are running down to the end of your allocation or that you are asking for more resources than your allocation can provide. On our MPP2 machine at PNNL you can't get on the queue if you request a job size that overruns your existing allocation but there seems to be no such restriction at NERSC. MPP2 will also provide a warning that you are asking for more than 20% of an existing allocation, which provides another opportunity to check that you may be doing something stupid.
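
For readers unfamiliar with the multiplier these comments mention, here is a hedged sketch of the arithmetic. Only the 6.5 Franklin charge factor is taken from the comments; the job parameters and the assumption that raw usage is counted in core-hours are illustrative:

    # Illustrative allocation arithmetic; the 6.5 multiplier comes from the
    # comments above, everything else is a hypothetical example.
    CHARGE_FACTOR = 6.5   # Franklin charge multiplier (per the comments)

    def charged_hours(cores: int, wallclock_hours: float) -> float:
        """Allocation hours charged, assuming raw usage = core-hours."""
        return cores * wallclock_hours * CHARGE_FACTOR

    # A single 1024-core, 12-hour run:
    print(charged_hours(1024, 12))   # 79872.0 allocation hours -- easy to
                                     # "blow through" an allocation quickly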

At some point it would be helpful to change the charging units to more up-to-date units such as Franklin CPU hours.

... My only problem is with the charge factors used for machine usage. A charge factor of 6.5 on Franklin means that I use a huge fraction of my allocation on just a few large runs. Although there is a large-scale run reimbursement program in effect, I am currently stuck, unable to use NERSC's machines, because the reimbursement has not yet been credited. ...

 

Disk quota comments:   3 responses

I hope for more quota. [Franklin user]

Disk quota should be increased. [PDSF user]

I am dissatisfied with the disk policies on the NERSC machines. If memory serves, I had a very small (almost useless) allocation in my home directory, and the area where I could work without worrying about hitting quotas would get wiped out after so many days of non-use. [Jacquard user]

 

PDSF comments:   3 responses

I have not used PDSF extensively lately and am rather reluctant to start doing so because of the very large number of hardware (disk) problems. In principle I would want to rely heavily on PDSF for my work in ATLAS, but it needs to become FAR MORE reliable than it has been lately. For me to rely on PDSF, I need to know that PDSF will be available (except in very rare cases) when I need to use it during the days before an important talk, conference etc.

PDSF has seen a change of lead in 2007; the ratings above reflect ratings integrated over the year and thus include both "before" and "after" this change. The new lead is on track to improving my overall satisfaction with NERSC and the specific PDSF configuration.

Disks and network access are sometimes slow.

 

Security comments:   3 responses

The lack of having to use a SecureID or equivalent makes NERSC a much nicer place to use than the competing large computing facilities at ORNL, EMSL, and NCAR.

The 'strong password' requirements are annoying. I tried to change my password about a dozen times, and each time the system spat back something different wrong with it. While I can appreciate wanting a secure system, I think that requiring both lower and upper case, numbers, and a special character, plus a minimum length, plus I forget what else, increases the likelihood users will have to write down the damn thing in order to remember it - which isn't very secure!
The other supercomputing systems I use (NCAR, ORNL) have gone to OTPs which IMHO work well.
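
As an aside, the kind of rule set this commenter describes can be stated compactly. A hypothetical sketch; the exact NERSC policy and the minimum length are not given in the survey, so both are assumptions:

    import re

    # Hypothetical password rules matching the comment's description:
    # lower case, upper case, a digit, a special character, minimum length.
    RULES = [
        (r"[a-z]", "needs a lower-case letter"),
        (r"[A-Z]", "needs an upper-case letter"),
        (r"[0-9]", "needs a digit"),
        (r"[^a-zA-Z0-9]", "needs a special character"),
    ]

    def check_password(pw, min_len=8):   # min_len is an assumed value
        """Return the list of unmet requirements (empty means acceptable)."""
        problems = [msg for pattern, msg in RULES if not re.search(pattern, pw)]
        if len(pw) < min_len:
            problems.append(f"needs at least {min_len} characters")
        return problems

    print(check_password("trivial"))  # fails several rules, much as the
                                      # commenter experienced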

I answered neutral to security because I'm not sure how DOE would respond to a "dissatisfied". I am dissatisfied because on 4 occasions our database server has been erroneously blocked by NERSC security without informing us that they were blocking it. On 2 of these occasions it happened on a Friday afternoon before they left for the weekend. I'm dissatisfied not because NERSC isn't secure, but because the security is getting in the way of our work, without basic human cross checks like calling us to ask us about a certain traffic pattern.

 

Software comments:   3 responses

In the old days of newton.nersc.gov it was easy to use Mathematica and Matlab at NERSC. Now there are those problems with fonts that make these two important tools virtually unusable for me.

I want to use HDF5 1.8.0 since it has high level API for fortran. ...

please keep NAG-SMP

 

Other comments:   8 responses

... I was troubled by the frequency with which DaVinci was down earlier this year.

Da Vinci is a very important part of my work. This machine works great! HPSS also works great!

Not a big deal, but it would be nice if the HSI command line interface was more like a *nix shell, with file name completion.

... Jacquard is somewhat OK, but the queue time is long.

1. The wall time is sometimes not enough, because some jobs cannot be broken into smaller parts. Furthermore, asking for too many processors will result in a long queue time. Adding these two together prohibits using NERSC computers for specific jobs.
2. I have suffered quite frequently from jobs crashing because of a malfunction of one of the nodes. I had neither the time nor the will to keep track of the lost time. I found that quite disappointing.
[Jacquard user]

I am planning to use data analysis services in the future. This is an important feature. NERSC should definitely support data services.

Nearly all of our data analysis was done on DoD supercomputers - primarily because our graphics person was DoD.

My bad scores reflect my bad experience using Jacquard. It took me quite some time to set up our software (hotspotter, a genetics program), because the storage was so limited that it was difficult even to compile the program with all libraries without exceeding the disk quota. Even when using the scratch space, not all data (SNP data of all human chromosomes and the corresponding hotspotter results) would fit simultaneously into the allotted quota.
After many hours of work, the system was finally running, and I noticed that on Jacquard I was allotted 4 (!) cores. I have an 8-core Macintosh sitting on my desk with many gigabytes of hard disk space. I figured that my own Mac is more of a supercomputer than what I was being offered at NERSC.
I did send an email about that issue, and admittedly it was explained to me that Jacquard and the queue I used are for low-priority jobs and that other systems have more to offer. However, this was not sent to me as an email; instead, I had to log in and view the request ticket status. I was not aware of having to log in, so I thought that my request was not being answered. By the time I figured out that I had to log in to view my response, I had already solved my computational tasks using computational resources not involving NERSC.
Suggestions: Old clusters and queues offering only 4 cores should be shut down; they are very discouraging to users. Maybe a small login message noting that other clusters have more to offer would have avoided this misunderstanding.
Another issue is that in genomics research, just as important as processing throughput is the ability to store data on a genome scale. About 50 GB would be useful for me. Again, this is easily offered by my desktop computer, so one expects to have at least this amount available on a supercomputer.