
2009/2010 User Survey Results

Response Summary

Many thanks to the 395 users who responded to this year's User Survey. The response rate is comparable to that of the last two years, and both are significantly higher than in earlier years:

  • 77.8 percent of the 126 users who had used more than 250,000 XT4-based hours when the survey opened responded
  • 30.9 percent of the 479 users who had used between 10,000 and 250,000 XT4-based hours responded
  • The overall response rate for the 3,533 authorized users during the survey period was 11.2 percent.
  • The MPP hours used by the survey respondents represent 66.8 percent of total NERSC MPP usage as of the end of the survey period.
  • The PDSF hours used by the PDSF survey respondents represent 20.0 percent of total NERSC PDSF usage as of the end of the survey period.

The respondents represent all six DOE Science Offices and a variety of home institutions: see Respondent Demographics.

The survey responses provide feedback about every aspect of NERSC's operation, help us judge the quality of our services, give DOE information on how well NERSC is doing, and point us to areas we can improve. The survey results are listed below.

The 2009/2010 User Survey asked users to rate us on a 7-point satisfaction scale. Some areas were also rated on a 3-point importance scale or a 3-point usefulness scale.

Satisfaction Score | Meaning | Number of Times Selected
7 | Very Satisfied | 8,053
6 | Mostly Satisfied | 6,219
5 | Somewhat Satisfied | 1,488
4 | Neutral | 1,032
3 | Somewhat Dissatisfied | 366
2 | Mostly Dissatisfied | 100
1 | Very Dissatisfied | 88

Importance Score | Meaning
3 | Very Important
2 | Somewhat Important
1 | Not Important

Usefulness Score | Meaning
3 | Very Useful
2 | Somewhat Useful
1 | Not at All Useful

The average satisfaction scores from this year's survey ranged from a high of 6.71 (very satisfied) to a low of 4.87 (somewhat satisfied). Across 94 questions, users chose the Very Satisfied rating 7,901 times, and the Very Dissatisfied rating 75 times. The scores for all questions averaged 6.16, and the average score for overall satisfaction with NERSC was 6.40. See All Satisfaction Ratings.
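
The Average Score and Std. Dev. columns in the tables that follow can be reproduced directly from an item's rating counts. The short Python sketch below (ours, not part of the survey software) does this for the PDSF uptime row, where 9 respondents chose 6 and 22 chose 7; it assumes the sample (n-1) standard deviation, which is the formula that reproduces the published 0.46.

    import math

    def score_stats(counts):
        """counts[score] = number of respondents who chose that score (1-7)."""
        n = sum(counts.values())
        mean = sum(score * c for score, c in counts.items()) / n
        # Sample (n-1) standard deviation; this choice is an assumption.
        var = sum(c * (score - mean) ** 2 for score, c in counts.items()) / (n - 1)
        return n, mean, math.sqrt(var)

    # PDSF: Uptime (availability) -- 9 users chose 6, 22 chose 7
    n, mean, sd = score_stats({6: 9, 7: 22})
    print(f"{n} responses, average {mean:.2f}, std dev {sd:.2f}")
    # 31 responses, average 6.71, std dev 0.46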

For questions that spanned previous surveys, the change in rating was tested for significance (using the t test at the 90% confidence level). Significant increases in satisfaction are shown in blue; significant decreases in satisfaction are shown in red.

Significance of Change
shown in blue: significant increase (change from 2009)
shown in red: significant decrease (change from 2009)
unmarked: not significant
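
As an illustration of that test, the sketch below flags a year-over-year change as significant at the 90% level. The report does not specify the exact variant used (paired versus independent samples, pooled versus unequal variances), so Welch's two-sample t test and the example ratings here are assumptions, not survey data.

    from scipy import stats

    def change_significance(ratings_prev, ratings_curr, alpha=0.10):
        """Return 'increase', 'decrease', or 'not significant' at the given level."""
        # Welch's two-sample t test; the actual test variant NERSC used is not stated.
        result = stats.ttest_ind(ratings_curr, ratings_prev, equal_var=False)
        if result.pvalue >= alpha:
            return "not significant"
        return "increase" if result.statistic > 0 else "decrease"

    # Hypothetical 2009 vs. 2010 ratings for one survey item:
    print(change_significance([5, 4, 5, 6, 5, 4, 5, 5], [6, 6, 7, 5, 6, 7, 6, 6]))
    # increase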

Highlights of the 2010 user survey responses include:

  • 2008/2009 user survey: On the 2008/2009 survey Franklin uptime received the second lowest average score (4.91).

    NERSC response: In the first half of 2009 Franklin underwent an intensive stabilization period. Tiger teams were formed, in close collaboration with Cray, to address system instability. These efforts continued in the second half of 2009 and throughout 2010, when NERSC engaged in a project to understand system-initiated causes of hung jobs and to implement corrective actions to reduce their number. These investigations revealed bugs in the SeaStar interconnect as well as in the Lustre file system. The bugs were reported to Cray and were fixed in March 2010, when Franklin was upgraded to Cray Linux Environment 2.2. As a result, Franklin's Mean Time Between Failures improved from a low of about 3 days in 2008 to 9 days in 2010.

    On the 2010 survey Franklin uptime received an average score of 5.99, a statistically significant increase of 1.08 points over the previous year. Two other Franklin scores (overall satisfaction, and disk configuration and I/O performance) also improved significantly.

    Another indication of increased satisfaction with Franklin is that on the 2009 survey 40 users requested improvements in Franklin uptime or performance, whereas only 10 made such requests on the 2010 survey.

  • 2008/2009 user survey: On the 2008/2009 survey ten users requested improvements for the NERSC web site.

    NERSC response: User services staff removed older documentation and made sure that the remaining documentation was up-to-date.

    On the 2010 survey the score for "ease of finding information on the NERSC web site" was significantly improved. Also, for the medium-scale MPP users the scores for the web site overall and for the accuracy of information on the web showed significant improvement.

  • In their overall comments about NERSC:
    • 73 respondents mentioned ease of use, good consulting, staff support and communications;
    • 65 users mentioned computational systems or HPC resources for science;
    • 22 highlighted good software support;
    • 19 were generally happy;
    • 14 mentioned good documentation and web services;
    • 10 pointed to good queue management or job turnaround;
    • 8 were pleased with data services (HPSS, large disk space, data management);
    • 4 complimented good networking, access and security.
  • The complete survey results are listed below and are also available from the left hand navigation column:
    1. Respondent Demographics
    2. Overall Satisfaction and Importance
    3. All Satisfaction and Importance Ratings
    4. HPC Resources
    5. Software
    6. Services
    7. Comments about NERSC

    User Satisfaction with NERSC

    Areas with Highest User Satisfaction

    Areas with the highest user satisfaction are those with average scores of more than 6.5.

    7=Very satisfied, 6=Mostly satisfied, 5=Somewhat satisfied, 4=Neutral, 3=Somewhat dissatisfied, 2=Mostly dissatisfied, 1=Very dissatisfied

    Item | Rating counts (ascending score; unselected scores omitted) | Total Responses | Average Score | Std. Dev. | Change from 2009
    PDSF: Uptime (availability) | 9, 22 | 31 | 6.71 | 0.46 | 0.35
    HPSS: Reliability (data integrity) | 2, 2, 2, 33, 124 | 163 | 6.69 | 0.68 | 0.01
    HPSS: Uptime (Availability) | 2, 5, 41, 115 | 163 | 6.65 | 0.60 | 0.02
    CONSULT: Overall | 1, 1, 5, 9, 59, 212 | 287 | 6.64 | 0.74 | 0.12
    GLOBALHOMES: Reliability | 7, 5, 47, 160 | 219 | 6.64 | 0.68 | n/a
    PROJECT: Reliability | 3, 4, 24, 79 | 110 | 6.63 | 0.69 | 0.08
    SERVICES: Account support | 1, 1, 2, 6, 12, 73, 243 | 338 | 6.60 | 0.80 | -0.06
    GLOBALHOMES: Uptime | 1, 4, 7, 58, 151 | 221 | 6.60 | 0.68 | n/a
    CONSULT: Response time | 1, 2, 6, 12, 61, 205 | 287 | 6.59 | 0.80 | -0.01
    PROJECT: Uptime | 1, 4, 3, 24, 78 | 110 | 6.58 | 0.79 | 0.03
    PROJECT: Overall | 3, 5, 32, 79 | 119 | 6.57 | 0.70 | 0.26
    CONSULT: Quality of technical advice | 1, 7, 14, 67, 190 | 279 | 6.57 | 0.74 | 0.09
    OVERALL: Services | 1, 6, 15, 114, 246 | 382 | 6.57 | 0.67 | n/a
    OVERALL: Security | 2, 11, 11, 69, 202 | 295 | 6.55 | 0.79 | 0.16
    NETWORK: Network performance within NERSC (e.g. Seaborg to HPSS) | 3, 10, 51, 114 | 178 | 6.55 | 0.68 | 0.04
    WEB: System Status Info | 1, 5, 12, 104, 181 | 303 | 6.51 | 0.68 | n/a

     

    Areas with Lowest User Satisfaction

    Areas with the lowest user satisfaction are those with average scores of less than 5.5.

    7=Very satisfied, 6=Mostly satisfied, 5=Somewhat satisfied, 4=Neutral, 3=Somewhat dissatisfied, 2=Mostly dissatisfied, 1=Very dissatisfied

    Item | Rating counts (ascending score; unselected scores omitted) | Total Responses | Average Score | Std. Dev. | Change from 2009
    NERSC SW: Data analysis software | 4, 1, 4, 36, 13, 43, 44 | 145 | 5.47 | 1.47 | -0.37
    NERSC SW: Visualization software | 4, 3, 6, 31, 16, 39, 49 | 148 | 5.47 | 1.54 | -0.45
    NERSC SW: ACTS Collection | 3, 31, 9, 22, 29 | 94 | 5.39 | 1.48 | -0.54
    TRAINING: Workshops | 1, 1, 19, 8, 13, 18 | 60 | 5.38 | 1.43 | -0.21
    HOPPER: Batch wait time | 1, 5, 16, 18, 35, 45, 26 | 146 | 5.19 | 1.41 | n/a
    FRANKLIN: Batch wait time | 7, 10, 43, 43, 81, 90, 29 | 303 | 4.87 | 1.43 | -0.68

     

    Significant Increases in Satisfaction

    The three survey results with the most significant improvement from 2009 were all related to the Franklin system. NERSC and Cray have worked hard to improve Franklin's stability in the past two years, and the improved scores demonstrate that these efforts directly resulted in improvements recognized by the users.

    7=Very satisfied, 6=Mostly satisfied, 5=Somewhat satisfied, 4=Neutral, 3=Somewhat dissatisfied, 2=Mostly dissatisfied, 1=Very dissatisfied

    Item | Rating counts (ascending score; unselected scores omitted) | Total Responses | Average Score | Std. Dev. | Change from 2009
    FRANKLIN: Uptime (Availability) | 4, 13, 12, 42, 119, 118 | 308 | 5.99 | 1.13 | 1.08
    FRANKLIN: Overall | 2, 6, 10, 35, 142, 117 | 312 | 6.12 | 0.94 | 0.37
    FRANKLIN: Disk configuration and I/O performance | 1, 1, 3, 36, 25, 112, 103 | 281 | 5.96 | 1.10 | 0.35
    PDSF: Uptime (availability) | 9, 22 | 31 | 6.71 | 0.46 | 0.35
    PROJECT: Overall | 3, 5, 32, 79 | 119 | 6.57 | 0.70 | 0.26
    SERVICES: Allocations process | 1, 3, 11, 23, 108, 134 | 280 | 6.27 | 0.91 | 0.24
    OVERALL: Available Computing Hardware | 1, 7, 11, 33, 167, 169 | 388 | 6.23 | 0.89 | 0.23
    OVERALL: Satisfaction with NERSC | 1, 1, 6, 25, 157, 200 | 390 | 6.40 | 0.75 | 0.17
    OVERALL: Security | 2, 11, 11, 69, 202 | 295 | 6.55 | 0.79 | 0.16
    WEB: Ease of finding information | 1, 1, 7, 11, 30, 142, 112 | 304 | 6.10 | 0.97 | 0.15
    CONSULT: Overall | 1, 1, 5, 9, 59, 212 | 287 | 6.64 | 0.74 | 0.12

     

    Significant Decreases in Satisfaction

    The largest decrease in satisfaction over last year's survey was for Franklin batch wait time: as Franklin became more stable it also became more popular and batch wait times increased.

    7=Very satisfied, 6=Mostly satisfied, 5=Somewhat satisfied, 4=Neutral, 3=Somewhat dissatisfied, 2=Mostly dissatisfied, 1=Very dissatisfied

    Item | Rating counts (ascending score; unselected scores omitted) | Total Responses | Average Score | Std. Dev. | Change from 2009
    FRANKLIN: Batch wait time | 7, 10, 43, 43, 81, 90, 29 | 303 | 4.87 | 1.43 | -0.68
    PDSF SW: STAR | 3, 2, 3, 7, 6 | 21 | 5.52 | 1.40 | -0.67
    NERSC SW: ACTS Collection | 3, 31, 9, 22, 29 | 94 | 5.39 | 1.48 | -0.54
    DaVinci: Disk configuration and I/O performance | 1, 7, 3, 12, 16 | 39 | 5.87 | 1.28 | -0.47
    NERSC SW: Visualization software | 4, 3, 6, 31, 16, 39, 49 | 148 | 5.47 | 1.54 | -0.45
    NERSC SW: Data analysis software | 4, 1, 4, 36, 13, 43, 44 | 145 | 5.47 | 1.47 | -0.37
    FRANKLIN: Batch queue structure | 3, 2, 9, 37, 44, 121, 82 | 298 | 5.71 | 1.21 | -0.19

     

    Satisfaction Patterns for Different MPP Respondents

    The MPP respondents were classified as "large" (usage over 250,000 hours), "medium" (usage between 10,000 and 250,000 hours) and "small" (usage under 10,000 hours). Satisfaction differences between these three groups are shown in the table below.
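
    Read literally, that classification amounts to the following small sketch (how usage exactly at the 10,000- and 250,000-hour boundaries was assigned is our assumption):

        def mpp_group(hours_used):
            """Group an MPP respondent by the usage thresholds quoted above."""
            if hours_used > 250_000:
                return "large"
            if hours_used >= 10_000:
                return "medium"
            return "small"

        print([mpp_group(h) for h in (1_200_000, 50_000, 2_500)])
        # ['large', 'medium', 'small']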

    The smaller MPP users were especially happy with data storage resources, and the larger MPP users with consulting and web services. It is interesting to note that the larger MPP users were the least satisfied with Franklin's batch queue structure, even though large jobs are favored on Franklin.

    Item | Large MPP users: Num Resp, Avg Score, Change from 2009 | Medium MPP users: Num Resp, Avg Score, Change from 2009 | Small MPP users: Num Resp, Avg Score, Change from 2009
    HPSS: Overall 65 6.38 -0.05 58 6.28 -0.16 24 6.88 0.44
    PROJECT: Overall 42 6.40 0.10 37 6.62 0.31 20 6.80 0.49
    PROJECT: File and Directory Operations 36 6.31 0.10 34 6.47 0.26 17 6.76 0.56
    CONSULT: Overall 91 6.71 0.19 116 6.65 0.12 53 6.66 0.13
    CONSULT: Quality of technical advice 89 6.63 0.14 113 6.53 0.05 50 6.60 0.12
    Security 83 6.60 0.21 122 6.57 0.17 49 6.47 0.07
    WEB: Accuracy of information 91 6.49 0.17 120 6.42 0.09 55 6.27 -0.05
    OVERALL: Satisfaction with NERSC 105 6.44 0.21 148 6.30 0.08 77 6.45 0.23
    WEB: www.nersc.gov overall 95 6.39 0.11 130 6.44 0.16 58 6.34 0.07
    SERVICES: Allocations process 88 6.19 0.16 108 6.28 0.25 52 6.37 0.33
    TRAINING: New User's Guide 53 6.25 0.10 83 6.35 0.21 35 5.97 -0.17
    HPSS: User interface (hsi, pftp, ftp) 63 6.06 0.04 54 5.63 -0.39 23 6.35 0.33
    OVERALL: Available Computing Hardware 105 6.34 0.34 148 6.10 0.10 76 6.17 0.17
    OVERALL: Available Software 87 6.29 0.08 126 5.98 -0.23 64 6.22 0.01
    WEB: Ease of finding information 90 6.17 0.22 122 6.23 0.28 55 5.96 0.01
    FRANKLIN: Overall 103 6.16 0.41 132 6.05 0.31 61 6.11 0.40
    FRANKLIN: Uptime (Availability) 102 6.12 1.21 130 5.96 1.05 59 5.80 0.89
    DaVinci: Overall 13 5.31 -0.91 11 6.18 -0.03 9 6.11 -0.10
    SERVICES: Data analysis and visualization consulting 33 5.58 -0.26 31 5.06 -0.71 17 6.18 0.34
    FRANKLIN: Disk configuration and I/O performance 98 5.93 0.33 118 6.06 0.46 52 5.90 0.30
    FRANKLIN: Ability to run interactively 76 5.78 0.02 94 6.02 0.26 49 5.96 0.20
    DaVinci: Disk configuration and I/O performance 12 5.58 -0.76 11 6.00 -0.34 9 6.11 -0.23
    FRANKLIN: Batch queue structure 103 5.56 -0.34 127 5.76 -0.15 54 5.89 -0.01
    NERSC SW: Data analysis software 43 5.42 -0.42 44 4.98 -0.86 31 5.84 -0.00
    NERSC SW: Visualization software 47 5.53 -0.38 49 5.06 -0.85 30 5.77 -0.15
    NERSC SW: ACTS Collection 27 5.48 -0.45 37 5.19 -0.75 20 5.65 -0.29
    FRANKLIN: Batch wait time 103 4.59 -0.96 130 4.78 -0.76 56 5.32 -0.23

     

    Changes in Satisfaction for Active MPP Respondents

    The table below includes only those users who have run batch jobs on the MPP systems. It does not include interactive-only users or project managers who do not compute. This group of users showed an increase in satisfaction for the NERSC web site.

    Item | Rating counts (ascending score; unselected scores omitted) | Total Responses | Average Score | Std. Dev. | Change from 2009
    CONSULT: Overall | 1, 4, 9, 51, 195 | 260 | 6.67 | 0.66 | 0.15
    PROJECT: Overall | 3, 4, 26, 66 | 99 | 6.57 | 0.72 | 0.26
    Security | 2, 9, 10, 57, 176 | 254 | 6.56 | 0.80 | 0.16
    WEB: www.nersc.gov overall | 2, 2, 13, 129, 137 | 283 | 6.40 | 0.68 | 0.12
    OVERALL: Satisfaction with NERSC | 1, 1, 6, 23, 130, 169 | 330 | 6.38 | 0.78 | 0.15
    WEB: Timeliness of information | 1, 7, 20, 113, 123 | 264 | 6.33 | 0.76 | 0.12
    SERVICES: Allocations process | 1, 2, 9, 22, 97, 117 | 248 | 6.27 | 0.90 | 0.23
    OVERALL: Available Computing Hardware | 1, 7, 9, 29, 147, 136 | 329 | 6.19 | 0.90 | 0.19
    WEB: Ease of finding information | 5, 10, 27, 122, 103 | 267 | 6.15 | 0.89 | 0.20
    FRANKLIN: Overall | 1, 6, 10, 35, 135, 109 | 296 | 6.11 | 0.92 | 0.36
    FRANKLIN: Uptime (Availability) | 3, 13, 11, 41, 114, 109 | 291 | 5.98 | 1.11 | 1.07
    FRANKLIN: Disk configuration and I/O performance | 1, 3, 34, 22, 109, 99 | 268 | 5.98 | 1.08 | 0.38
    FRANKLIN: Batch queue structure | 3, 2, 8, 36, 43, 112, 80 | 284 | 5.71 | 1.22 | -0.19
    SERVICES: Ability to perform data analysis | 1, 2, 3, 17, 17, 34, 32 | 106 | 5.61 | 1.33 | -0.32
    NERSC SW: Visualization software | 3, 3, 6, 28, 14, 32, 40 | 126 | 5.40 | 1.54 | -0.51
    NERSC SW: ACTS Collection | 2, 29, 9, 18, 26 | 84 | 5.39 | 1.43 | -0.54
    NERSC SW: Data analysis software | 3, 1, 3, 34, 11, 34, 32 | 118 | 5.36 | 1.46 | -0.47
    FRANKLIN: Batch wait time | 7, 10, 43, 42, 79, 82, 26 | 289 | 4.82 | 1.43 | -0.73

     

    Changes in Satisfaction for PDSF Respondents

    The PDSF users are clearly more satisfied with data analysis resources than the MPP users.

    Item | Rating counts (ascending score; unselected scores omitted) | Total Responses | Average Score | Std. Dev. | Change from 2009
    PROJECT: I/O Bandwidth | 1, 4 | 5 | 6.80 | 0.45 | 0.56
    NETWORK: Remote network performance to/from NERSC (e.g. Hopper to your home institution) | 3, 4 | 7 | 6.57 | 0.53 | 0.42
    SERVICES: Ability to perform data analysis | 4, 5 | 9 | 6.56 | 0.53 | 0.62
    SERVICES: Allocations process | 1, 4, 6 | 11 | 6.45 | 0.69 | 0.42
    SERVICES: Data analysis and visualization assistance | 2, 1, 6 | 9 | 6.44 | 0.88 | 0.61
    OVERALL: Available Computing Hardware | 2, 9, 12 | 23 | 6.43 | 0.66 | 0.43
    NERSC SW: Performance and debugging tools | 8, 2 | 10 | 6.20 | 0.42 | 0.33
    WEB: www.nersc.gov overall | 1, 1, 1, 6, 2 | 11 | 5.45 | 1.57 | -0.82
    PDSF SW: STAR | 2, 1, 3, 4, 3 | 13 | 5.38 | 1.39 | -0.80
    TRAINING: New User's Guide | 1, 1, 2, 4, 1 | 9 | 5.11 | 1.76 | -1.03
    WEB: Ease of finding information | 1, 1, 3, 6 | 11 | 5.09 | 1.38 | -0.86

     

    Survey Results Lead to Changes at NERSC

    Every year we institute changes based on the previous year's survey. In 2009 and early 2010 NERSC took a number of actions in response to suggestions from the 2008/2009 user survey.

     

    Users Provide Overall Comments about NERSC

    132 users answered the question "What does NERSC do well?"

    Some representative comments are:

    User support is fantastic - timely and knowledgeable, including follow-up service. New machines are installed often, and they are state-of-the-art. The queues are crowded but fairly managed.
    Everything that is important for me. This is a model for how a computer user facility should operate.
    User support is very good. Diversity of computational resources.
    Website is first class, especially the clear instructions for compiling and running jobs.
    very good account management tools (nim), good software support
    HPSS is fast.
    I really like the new machine Carver. It is efficient.
    The account allocation process is very fast and efficient.
    NERSC has always been user centered - I have consistently been impressed by this.
    NERSC has proven extremely effective for running high resolution models of the earth's climate that require a large number of processors. Without NERSC I never would have been able to run these simulations at such a high resolution to predict future climate. Many thanks.
    We run data intensive jobs, and the network to batch nodes is great! The access to processors is also great.

    105 users responded to "What can NERSC do to make you more productive?"

    The top areas of concern were long queue turnaround times, the need for more computing resources, queue policies, and software support. Some of the comments from this section are:

    There are a lot of users (it is good to be useful and popular), and the price of that success is long queues that lead to slower turn-around. A long-standing problem with no easy answer.
    Add more processors to carver. The long queue time makes progress slow. Carver clobbers hopper and franklin with the performance increase of my code. Also recompiling the code is much faster on carver. Yet because i have to wait longer to start and restart the simulations, it doesn't get me results faster overall.
    Turn around time is always an issue! More resources would be great!
    Add a machine with longer wall-clock limit and less core/node (to save allocation). Not all users have to chase ultra massive parallelization for their research.
    Allow longer jobs (such as one month or half year) on Carver and Hoppers. Let science run to the course.
    I often have a spectrum of job sizes to run (e.g., scaling studies, debugging and production runs) and the queuing structure/algorithms seem to be preferential to large runs. It would improve my productivity if this was more balanced or if there were nearly identical systems which had complementary queuing policies.
    Provide an interface which can help the user determine which of the NERSC machines is more appropriate at a given time for running a job based on the number of processors and runtime that are requested.
    During the working day, i would always encourage the availability of more development nodes over production ones.
    Flexibility of creating special queues for short term intensive use without extended waiting time. Hopefully it will not be too expensive, either.
    Better ability to manage group permissions and priorities for jobs, files, etc. The functionality of the idea of project accounts is still relevant.
    It would help developers if a more robust suite of profiling tools were available. For example there are some very good profiling tools for franklin, but they are not robust enough to analyze a very large program.
    Allow subproject management and helpful issue tracker (wikis, as well) ala github.com or bitbucket.org
    I could possibly use some more web-based tutorials on various topics: MPI programming, data analysis with NERSC tools, a tutorial on getting Visit (visualization tool) to work on my Linux machine.
    I also need a better way to perform remote visualization on Euclid with Visit.
    Increase home and scratch area quota in general. Lot of time is wasted in managing the scratch space and archiving and storing the data.
    Improve the performance and scaling of file I/O, preferably via HDF5.
    HPSS should allow users to view the file contents. Add a "less" "more" there. At present, I have to transfer files back to franklin and view to see whether those are the files that I need.
    Make hsi/htar software available on a wider variety of Linux distributions.
    Keep the supercomputers up more. Make them more stable. Reduce the variability in the wallclock times for identical jobs.
    If in future one can run jobs with more Memory than available at present , researchers in general would benefit tremendously.
    Allow more than 3 password tries before locking someone out
    If scheduled maintenance was at the weekend that would make my work more productive.

    15 users responded to "If there is anything important to you that is not covered in this survey, please tell us about it."