2021 User Survey Results

Methodology

NERSC conducts an annual user survey to collect feedback on the quality of its services and computational resources. The user survey was first conducted in 1998 and underwent significant revisions in 1999 and 2018. Since 2019 we have outsourced the survey process to the National Business Research Institute (NBRI), a global research and consulting organization with expertise in customer surveys.

Outsourcing the annual survey has yielded several benefits:

Advice on survey design adjustments to follow best practices.
Expert analysis of the survey results, including root-cause analysis identifying which survey topics had the most impact on the overall satisfaction scores, and text analytics identifying common positive and negative themes in responses to free-form comment questions.
Increased user participation. Before using NBRI, NERSC needed to send several email reminders, including reminders targeted at individual users, to reach our goal of responses from 10% of the user community. Since 2019 we have received responses from more than 10% of NERSC users without needing to remind users repeatedly. (See Figure 1.)

Figure 1. Outsourcing the user survey to NBRI and making it shorter starting in 2018 helped to arrest a downward trend in response rates.

NERSC aims to receive survey responses from 10% of active users, representing 50% of NERSC-hour usage, to ensure that the users who interact most closely with NERSC systems and services are sufficiently represented and that the survey responses reflect the impact of NERSC operations.

The 2021 User Survey ran from November 16, 2021 to January 25, 2022. The survey was sent to 8,778 active users and received responses from 926 of them, representing 10.5% of active users and 39% of the total charged hours in 2021. NBRI found that the survey results reach a 99.86% confidence level with a 5% sampling error. Unlike previous years, in which the definition of a user and the identity management systems had changed, the increase in user survey responses is fully a result of the 8% increase in the number of NERSC users.
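The relationship between sample size, sampling error, and confidence level can be sketched with the standard formula for a proportion. This is a minimal sketch, not NBRI's documented method: it assumes a worst-case proportion of 0.5, a normal approximation, and a finite population correction, and the function name `confidence_for_margin` is hypothetical.

```python
import math

def confidence_for_margin(n, population, margin, p=0.5):
    """Two-sided confidence level implied by a sample of size n drawn
    from a finite population, for a given margin of error on a proportion.

    Assumptions (not taken from the report): worst-case proportion p=0.5,
    normal approximation, finite population correction.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    # The correction shrinks the error when the sample is a sizable
    # fraction of the whole population.
    fpc = math.sqrt((population - n) / (population - 1))
    z = margin / (standard_error * fpc)
    # For a standard normal variable, P(|Z| <= z) = erf(z / sqrt(2)).
    return math.erf(z / math.sqrt(2))

# Figures from the paragraph above: 926 responses, 8,778 users surveyed,
# 5% sampling error.
level = confidence_for_margin(926, 8778, 0.05)
print(f"implied confidence level: {level:.2%}")
```

Under these assumptions the result lands close to the confidence level NBRI reported, which suggests (but does not confirm) that a calculation of this kind was used.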

The survey responses represented a smaller fraction of NERSC usage than in previous years, most likely because of a flatter distribution of allocated hours. The projects to which survey respondents belonged accounted for more than 80% of all NERSC hours used, so we are confident that the survey responses were representative of most of our active user base.

Survey sent to 8,776 active users	2020 Target	2020 Actual	2021 Target	2021 Actual
Number of users surveyed	8,118	8,118	8,776	8,776
Total number of responses	812	1,010	877	926
% of all active users responding	10%	12.4%	10%	10.5%
% of NERSC hours represented by survey respondents	50%	51.2%	50%	39%

Table 1. 2021 NERSC user survey response rate.

Survey Design and Methods

Since 2019, the survey has used a six-point scale from “very dissatisfied” to “very satisfied” (Figure 2). NBRI advised that the six-point scale is a survey best practice: because it disallows a completely neutral response, it gives a better understanding of user sentiment.

Figure 2. A sample of the 2021 survey. Users select their sentiment on each item by clicking the appropriate phrase.

Results

The overall satisfaction scores for 2021 are shown in Table 2.

Survey Area	2020 Target	2020 Actual	2021 Target	2021 Actual	% responses exceeding target
Overall Satisfaction	4.5	5.28	4.5	5.45	91%
Services Overall	4.5	5.42	4.5	5.55	93%
Average of user support ratings	4.5	5.35	4.5	5.43	88%

Table 2. Overall 2021 NERSC user satisfaction ratings.
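The two kinds of figures in Table 2, an average score and a percentage of responses exceeding the target, can be derived from the per-response ratings on the six-point scale. The following is a minimal sketch that assumes "exceeding target" means an individual rating above 4.5 (that is, a 5 or a 6). The response counts are invented for illustration and are not NERSC data, and `summarize` is a hypothetical helper name.

```python
def summarize(counts, target=4.5):
    """Return (mean score, share of responses above target).

    counts maps each scale point (1..6) to the number of responses
    that selected it.
    """
    total = sum(counts.values())
    mean = sum(score * n for score, n in counts.items()) / total
    above = sum(n for score, n in counts.items() if score > target)
    return mean, above / total

# Hypothetical response counts, invented for illustration only.
example_counts = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 2}
mean_score, share_above = summarize(example_counts)
print(f"mean = {mean_score:.2f}, share above target = {share_above:.0%}")
# prints "mean = 5.25, share above target = 75%"
```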

The overall satisfaction scores were well above target and slightly higher than the 2020 scores, although none of the changes were statistically significant. The average satisfaction scores in each category remained well above the minimum satisfaction target, as shown in Figure 4.

Figure 3. Overall satisfaction metric over time since the survey was first implemented. Scores before 2019 (left of the dotted line) are adjusted from the previous 7-point scale to the current 6-point scale. The algorithm used to adjust from the 7-point scale to the 6-point scale was described in the Operational Assessment report for 2019.
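The adjustment algorithm itself is documented in the 2019 report and is not reproduced here. Purely as an illustration of what such a conversion involves, the sketch below uses a generic linear rescaling that maps the endpoints of one scale onto the endpoints of the other. This is an assumed stand-in, not necessarily the method NERSC used, and `rescale` is a hypothetical name.

```python
def rescale(score, old_min=1, old_max=7, new_min=1, new_max=6):
    """Linearly map a score from one rating scale onto another.

    Generic illustration only; the report's actual adjustment
    algorithm may differ.
    """
    fraction = (score - old_min) / (old_max - old_min)
    return new_min + fraction * (new_max - new_min)

# Endpoints map to endpoints; the 7-point midpoint (4) lands on 3.5.
for old_score in (1, 4, 7):
    print(old_score, "->", rescale(old_score))
```

A linear map preserves the ordering of scores and the relative spacing between them, which is the minimum any such adjustment needs for year-over-year trends to stay comparable.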

Figure 4. Average satisfaction scores for each category.

Figure 5. Average scores for all quantitative survey questions were high and above target.

Factors Affecting Satisfaction

The survey analysis from NBRI identified a number of high-impact themes across the quantitative and qualitative survey questions, that is, survey questions and user comments that had a high impact on the overall satisfaction rating. The highest-impact categories for NERSC user satisfaction, as identified by NBRI, are unchanged from the previous year's survey. NERSC services and computing resources have the most significant impact on overall satisfaction ratings, with some specific high-impact themes being:

Computational Resources: While users always want more resources, NERSC users are positive about the quality and diversity of resources NERSC provides. Some users did report finding downtimes due to scheduled maintenance or unscheduled events too disruptive.
Technical Support: The ability of NERSC staff to quickly and effectively resolve technical issues had a positive impact on overall satisfaction, according to NBRI’s statistical analysis.
Documentation: NERSC's documentation is considered high quality; however, some users also noted difficulties finding the information they need. NBRI identified opportunities to increase user satisfaction by developing more documentation targeting novice users.
Queue Time: Queue times were cited as an area users would like to see improved, but were only weakly correlated with lower overall satisfaction. This suggests that users understand that NERSC computational resources are in high demand.

Understanding and Addressing Dissatisfaction in Users

We aim each year to identify and address sources of dissatisfaction. The primary sources of dissatisfaction identified in 2020 for attention in 2021 were the frequency of downtimes and a desire for expanded and improved documentation.

We identified three factors as likely contributors to the perception of high downtime in 2020:

Some scheduled but multi-day outages as the NERSC facility electrical power was upgraded in preparation for Perlmutter
Two multi-day outages – one scheduled and one unscheduled – relating to the Lustre filesystem
The fact that in 2020 NERSC had only one computational system (Cori), so users had no second system to use when Cori was unavailable due to scheduled maintenance.

In 2021 Cori had no multi-day outages, and for the latter part of the year Perlmutter was available to a growing number of early-access users, so the likely-contributing factors to the perception of high downtime in 2020 were mostly absent. Supporting that hypothesis, downtime was not identified as a major cause of dissatisfaction in 2021.

NERSC documentation continues to be highly impactful. Users both praised it for its quality and coverage and indicated a desire for more and better documentation, especially for novice users.

To better target novice users, NERSC began a service of one-to-one appointments for certain help topics. This was well received, with the "NERSC 101" topic being the most popular.

The key themes for user dissatisfaction in 2021 were queue times and documentation.

Queue Times

Job queue times are a perennial factor in user dissatisfaction and are driven mostly by the high demand for compute resources. While long queue times were cited by users who rated NERSC lower on overall satisfaction, NERSC's computational resources had a positive impact on overall satisfaction when all survey responses were considered.

NERSC has implemented several capabilities, such as flexible-time jobs, to shorten queue times, and the overall system utilization for Cori in 2021 was more than 93%. In 2022, the additional resources of Perlmutter will be available, and we expect this will also help meet the demand for computational resources.

Improving Documentation

NERSC continues to expand and improve its documentation, and throughout 2021 supplemented this with an appointment system for one-to-one user support. In 2021, we created a documentation task force and a new-user experience task force to determine where gaps existed in our documentation and how to improve the new-user experience. There was some overlap in these two areas, and we applied feedback from the commenters about how we could improve our documentation to address both. In particular, we began to develop some tutorials that could be helpful to new users; work is ongoing in this area. NERSC’s user documentation is kept in a GitLab repository, and in 2021 there were 425 merge requests (suggested changes to the documentation) that were merged to the repository.

Figure 6. NERSC documentation is actively updated. The increase in merge requests toward the end of the year corresponds to when Perlmutter became available to an increased number of early-access users, and NERSC documentation was kept up to date with the still-frequently-changing system.

User Comments

The three free-form questions in the user survey give users a chance to comment on issues important to them. Users are asked “What can NERSC do to serve you better?”, “What does NERSC do well?” and “Is there any other issue on which you would like to comment?”

What does NERSC do well?

We received 412 responses to the question “What does NERSC do well?” There were a few broad categories that users called out the most:

Consulting, account support, and training
Providing high-quality, usable resources
Documentation
Software
Communication with users.

Sample user comments

NERSC is great at supporting a broad range of users to do capacity-class computing. In general, they do a good job of making HPC as painless as possible.
NERSC is the model for a computing service. Given the number of users/software they support they do an outstanding job! They are dependable (within the limits of supporting a large computing infrastructure), they provide outstanding support and a short resolution time to requests. I am overall extremely thankful for their services.
Trainings! This is one of the main reasons I continue to work through NERSC. The documentation, trainings, and support are all top notch.
It is impressive that NERSC supports Cori Haswell, KNL and CGPU, and now Perlmutter -- the workload for maintaining these machines must be immense! The volume of software supported on Cori is enough to run almost any application.
Great user support, reliable hardware, timely and helpful communication, predictable operation and many opportunities to learn.

What can NERSC do to serve you better?

This question elicited 344 responses. The most common requests were:

Reduce NERSC system maintenance periods
Reduce queue wait times
Fewer storage system interruptions
Improve documentation
Provide more CPU resources.

Sample user comments

Sometimes the Cori Haswell queue is very busy, job waiting time is a bit long (over 3 days). Shorter waiting time will always be appreciated.
Reduce maintenance times (ideally maintenance that doesn't disturb site availability)
Better (stable and fast) storage access. It would be very helpful if there are (email) notifications before removal of big data on the $SCRATCH quota.
The consistency of documentation could be improved. Even infrequent disagreement in instructions from different sources at NERSC can be extremely confusing. Clearing up such confusion costs a great deal of time.
Please keep a significant CPU component, not all large parallel scientific codes can run on hybrid or pure GPU architectures without significant effort.

Is there any other issue on which you would like to comment?

Comments in this section largely reflected those in the previous two, along with either requests or thanks for specific resources and software NERSC provided.