2018 User Survey Results

Methodology

NERSC conducts a yearly user survey to collect feedback on the quality of our services and computational resources. The user survey was first conducted in 1998, with significant revisions in 1999 and again in 2018; the 2018 revisions are described in this section.

NERSC aims to receive survey responses from about 10% of active users, representing at least 50% of NERSC-Hour usage, to ensure that the users who interact most closely with NERSC systems and services are sufficiently represented and that the survey responses reflect the impact of NERSC operations. To reduce the burden on users filling out the survey, the 2018 User Survey had only 23 questions, making it significantly shorter than in previous years, while retaining the same high-level questions to provide continuity with previous years. The 23 questions comprised 20 ratings questions and three free-form feedback questions.

The 2018 User Survey ran from December 12, 2018 to February 11, 2019. On December 12, NERSC had 6,964 active users, based on the definition given above, who were thus able to respond to the user survey. As shown in Table 1, the 2018 survey had responses from 8.3% of those users, representing 56.6% of the NERSC-Hours used. While this is slightly below the target for all active users, the survey did achieve an increase in the percentage of NERSC-Hours represented, and if we consider only users who ran jobs on Cori or Edison, the response rate is 10.5% (448 of 4,285). (For 2019, NERSC has implemented a new vetting process for users and is also requiring that all users revalidate their user information and turn on MFA. We expect these changes will reduce the difference between the number of users who are enabled to run on the system and the number who actually do.)

Survey sent to 8,026 active users	2017 Target	2017 Actual	2018 Target	2018 Actual
Total # of responses	648	656	696	577
% of all active users responding	10%	10.9%	10%	8.3%
% of users who ran jobs responding	N/A	11.9%	N/A	10.5%
% of NERSC-Hours represented by survey respondents	50%	54.0%	50%	56.6%

Table 1. User survey response rate.
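
The 2018 response rates quoted above can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Python; the variable and function names are illustrative and are not part of any NERSC tooling, while the counts are taken from the text and Table 1.

```python
# Figures quoted in the text and in Table 1 for the 2018 survey.
ACTIVE_USERS = 6964              # active users on December 12, 2018
RESPONSES = 577                  # total survey responses
USERS_WHO_RAN_JOBS = 4285        # users who ran jobs on Cori or Edison
RESPONSES_FROM_JOB_USERS = 448   # respondents among those users
TARGET_RATE = 0.10               # target: about 10% of active users


def response_rate(responses: int, population: int) -> float:
    """Return the fraction of the population that responded."""
    if population <= 0:
        raise ValueError("population must be positive")
    return responses / population


overall = response_rate(RESPONSES, ACTIVE_USERS)
job_users = response_rate(RESPONSES_FROM_JOB_USERS, USERS_WHO_RAN_JOBS)

print(f"All active users:   {overall:.1%}")
print(f"Users who ran jobs: {job_users:.1%}")
print(f"Meets 10% target (all active users)? {overall >= TARGET_RATE}")
```

Expressing the rate as a function of the respondent count and the population makes it easy to see how the choice of denominator (all active users versus users who ran jobs) changes whether the 10% target is met.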

The survey uses a seven-point rating scale, where “1” is “very dissatisfied” and “7” indicates “very satisfied.” For each question, the average score and standard deviation are computed. The change in score compared to the previous year is considered significant if it passes the standard t-test criterion at the 95% confidence level.
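
As an illustration of the significance criterion, the sketch below computes a two-sample t statistic from summary statistics and compares it with a 95% two-sided threshold. Several things here are assumptions rather than details from the survey: the report does not say which t-test variant is used, so Welch's unequal-variance form is shown; the threshold uses the large-sample normal approximation (about 1.96) in place of an exact t critical value; the standard deviations of 1.0 are hypothetical; and the respondent counts assume every respondent answered the question. Only the two means (overall satisfaction, from Table 2) come from the report.

```python
import math
from statistics import NormalDist


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic for the difference mean_b - mean_a."""
    standard_error = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return (mean_b - mean_a) / standard_error


def is_significant(t_stat, confidence=0.95):
    """Two-sided test using a large-sample normal approximation."""
    # With hundreds of respondents the t distribution is close to normal,
    # so the normal quantile (about 1.96 at 95%) stands in for the exact
    # t critical value.
    critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return abs(t_stat) > critical


# Means are from Table 2. The standard deviations (1.0) are hypothetical,
# and the counts assume every survey respondent answered this question.
t_stat = welch_t(6.38, 1.0, 656, 6.22, 1.0, 577)
print(f"t = {t_stat:.2f}, significant: {is_significant(t_stat)}")
```

Because the outcome depends directly on the standard deviations, the result of this sketch says nothing about the actual survey; it only shows how the mean, standard deviation, and respondent count for each year feed into the test.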

Results

Historically, the average score for overall satisfaction has been in the “very satisfied” range, as can be seen in Figure 1, and this was true for 2018. Table 2 shows that satisfaction scores were slightly lower in 2018 than in 2017, though still well above the target of 5.25 in all categories. Figure 2 shows the distribution of satisfaction scores in more detail.

Figure 1. Overall satisfaction metric over time, since the survey was started.

Survey Area	2017 Target	2017 Actual	2018 Target	2018 Actual	Significant Change
Overall Satisfaction	5.25	6.38	5.25	6.22	-0.16
Avg. of User Support ratings (Services Overall)	5.25	6.42	5.25	6.36	-0.14

Table 2. Overall satisfaction ratings.

Figure 2. User survey overall satisfaction metrics.

Factors Affecting Satisfaction

Overall satisfaction in 2018 remained high, with a score of 6.22. In addition, the high-level survey question areas on Services, Computing Resources, Data Resources, and Software all remained high and above the NERSC target. We can infer why each may have decreased slightly based on comments and feedback from users.

The survey solicits free-form comments about what NERSC does well and could do better, and key themes we see repeated in the comments are summarized in Table 3.

Factors Increasing Satisfaction	Factors Decreasing Satisfaction
Consulting support; Web-based documentation; Large-scale resources; Provision of software; System stability and uptime	Queue wait times; Storage space limits; Requirement to use MFA; Compute time/allocation limits; System stability and uptime

Table 3. Themes seen in survey comments sections.

The themes are mostly consistent with results from 2017 and previous years; users remain very happy with NERSC's user support and consulting, and appreciate the scale of resources NERSC provides but would like the scale to be larger still (more compute resources, more storage space).

A new development in 2018 was the rollout of multi-factor authentication (MFA) support on an opt-in basis, and at the beginning of 2019 -- while the survey was being run -- MFA became mandatory for most users. The additional required step at login time increases security but imposes an initial learning curve on users and is undeniably less convenient. While we do not know how this affected user scores quantitatively, the impact of requiring MFA was visible in user comments.

Some User Comments

The free-form questions in the user survey provide some insights into what drives users’ perceptions of NERSC. Users are asked “What can NERSC do to serve you better?” and “What does NERSC do well?” to gauge what users think we can improve and what we should continue doing. We reproduce some representative comments below.

Users greatly appreciated the efforts of NERSC consultants and account support:

Staff is very responsive and eager to ensure that the science gets done.
NERSC consultants are great; always professional, courteous, and very knowledgeable.
The availability of consulting and support staff makes the process of dealing with any problems generally quite smooth. To me, it seems that the people behind NERSC are the main strength of the system.
The NERSC staff have been exceptionally helpful to me. I am a new user this year and had many questions. The office-hour sessions were particularly useful.
We've been very pleased with the level of support we received from the help desk consultants this year. They were very responsive to our problems and concerns. Great customer service.
People and support -- every time I've interacted with NERSC staff, they've been knowledgeable and very helpful. One of the staff even helped me build VisIt from source this year! I also really appreciate the availability of statistical data like wait times on Cori.
Customer service. We have been working with NERSC for many years and all issues have always been handled quickly, fairly, and professionally.
User/customer support, ticket system. It is probably the best of any system anywhere (not just HPC).
NERSC staff have never failed to answer my questions or solve our user problems.

Users also appreciated NERSC’s documentation:

Excellent user guides on all of the provided tools.
Comprehensive documentation of the functionalities of the various computing resources on the NERSC website.
NERSC user-friendliness is, by far, the best. Documentation via the website, the implementation of the Jupyter notebook hub, and other things that just seem to add up to make NERSC more attractive.
The website is very complete and has enough tutorials and examples to learn about the resources and how to run the jobs.
NERSC computing resources are very well documented, which has been very helpful to me as a beginner.

They thought that NERSC’s communication, training, and outreach strategies were effective:

NERSC does an excellent job with providing information, access, and announcements with using the supercomputers. I always find their emails informative and with very useful information.
NERSC holds many seminars that have been helpful to me as a beginner.
Communication through the weekly emails, and I love the training/learning sessions.
Really enjoying the podcast, I listen to every one.
I appreciate the stable software environment and the level of communication about outages and changes from NERSC staff.

Users also appreciated the scale and stability of NERSC compute and storage systems and the innovations such as Spin, the interactive queue on Cori, and Jupyter:

NERSC serves its users well by providing large-scale resources dedicated to a user community with limited access to HPC at this scale. The computing environment is well organized and optimized software access is well-maintained.
NERSC has an incredible uptime combined with ease of use and reliability of the entire setup.
NERSC is an excellent place to perform large-scale high performance computing. The wait times are most often short.
NERSC does an *excellent* job of maintaining HPC systems with up-to-date software, keeping the machines active and available for jobs most of the time.
NERSC has done a good job to give each user sufficient disk space to run their jobs. The HPSS system works very well.
Provides access to large compute resources that are well-maintained and supported.
Provides reasonably accessible high performance computing resources and, increasingly, growing support for a sophisticated set of modern data analysis tools. (jupyter-dev.nersc.gov has become central to my work; this is great!).
HPSS works very well for my data storage needs (100s TB). It is reliable and transfers are reasonably fast. In addition, I am happy with the NERSC data portal that hosts a portion of my data for public access.
Good availability outside of scheduled maintenance. New Spin service has great potential.
The interactive nodes are extremely helpful in debugging and data analysis.
Ecosystem that is more than just FLOPS and disk bandwidth: Jupyter, SPIN, science gateways, DTNs, cron jobs, workflow tools, interactive queue, software modules, etc.
The scratch-aware Jupyter notebook is super helpful for post-processing or data analysis.

While support for MFA was not universal, several users recognized the effort that went into making it as seamless as possible:

NERSC's recent work on user-friendly MFA is greatly appreciated.
I like the new MFA system for logging in. I have not had any issues using the supercomputing resources at NERSC.
sshproxy makes MFA much less painful, well done.
2fa (2 Factor Authentication) added this year is certainly more of a pain, but not so much of a pain that it hurts my work. Security tradeoff is more than worth it.

The largest criticism of NERSC was that queue wait times were too long and that more resources were needed. More than a third of the comments on what NERSC could do better concerned long queue wait times, low allocations, or related issues.

It's hard to predict when a particular job might start running, which makes it difficult to plan out research projects sometimes. The mean times as a function of job length are somewhat helpful, but there are no discernible patterns there, and sometimes the wait times are counterintuitive (a longer job time does not always correspond to a longer wait time). Perhaps some more insight about wait times would be useful.
The primary challenge recently has been long queue wait times. Increasing the number of available compute nodes (over the long term) is really needed to keep up with user demand.
If possible, we would like to have more allocation for our group. Or it may be better for NERSC to increase it computing capability.

Finally, we did not solicit these comments:

NERSC is the dream platform for scaling up computations quickly and easily.
Your computing infrastructure is my favorite to use. Cori KNL is a game changer.
Everything. It is amazingly perfect for my computational needs.