
2007/2008 User Survey Results

Response Summary

Many thanks to the 467 users who responded to this year's User Survey. The response rate has significantly increased from previous years:

  • 70 percent of users who had used more than 1 million MPP hours when the survey opened responded
  • 43 percent of users who had used between 10,000 and 1 million MPP hours responded
  • The overall response rate for the 2,804 authorized users during the survey period was 16.3%.

The respondents represent all six DOE Science Offices and a variety of home institutions: see Respondent Demographics.

The survey responses provide feedback about every aspect of NERSC's operation, help us judge the quality of our services, give DOE information on how well NERSC is doing, and point us to areas we can improve. The survey results are listed below.

You can see the 2007/2008 User Survey text, in which users rated us on a 7-point satisfaction scale. Some areas were also rated on a 3-point importance scale or a 3-point usefulness scale.

Satisfaction Score    Meaning    Number of Times Selected
7    Very Satisfied    9,486
6    Mostly Satisfied    6,886
5    Somewhat Satisfied    1,682
4    Neutral    1,432
3    Somewhat Dissatisfied    485
2    Mostly Dissatisfied    130
1    Very Dissatisfied    81

Importance Score    Meaning
3    Very Important
2    Somewhat Important
1    Not Important

Usefulness Score    Meaning
3    Very Useful
2    Somewhat Useful
1    Not at All Useful

The average satisfaction scores from this year's survey ranged from a high of 6.71 (very satisfied) to a low of 4.46 (neutral). Across 128 questions, users chose the Very Satisfied rating 9,486 times, and the Very Dissatisfied rating 81 times. The scores for all questions averaged 6.07, and the average score for overall satisfaction with NERSC was 6.3. See All Satisfaction Ratings.

For questions that have appeared on the surveys from 2003 through 2007/2008, the change in average rating was tested for significance (using the t test at the 90% confidence level). Significant increases in satisfaction are shown in blue; significant decreases in satisfaction are shown in red.

Significance of Change (color key for the Change from 2006 column)
  blue = significant increase (change from 2006)
  red = significant decrease (change from 2006)
  no color = not significant
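
The exact test procedure is not described here, but as an illustration, the following sketch (in Python, with hypothetical placeholder numbers rather than values from either survey) shows one common way such a comparison can be made from summary statistics alone: a two-sample t test on an item's mean rating, standard deviation, and response count for the two years, evaluated at the 90% confidence level. Whether a pooled-variance or unequal-variance form was used is an assumption; Welch's unequal-variance form is used below.

    # Sketch only, not NERSC's statistics code. Tests whether the change in an
    # item's mean rating between two surveys is significant at the 90% level,
    # using only each year's mean, standard deviation, and number of responses.
    from scipy.stats import ttest_ind_from_stats

    # Hypothetical placeholder statistics (not taken from either survey).
    mean_new, std_new, n_new = 5.50, 1.40, 180   # current survey
    mean_old, std_old, n_old = 5.90, 1.30, 160   # previous survey

    # Welch's (unequal-variance) two-sample t test from summary statistics.
    t_stat, p_value = ttest_ind_from_stats(mean_new, std_new, n_new,
                                           mean_old, std_old, n_old,
                                           equal_var=False)

    alpha = 0.10   # 90% confidence level
    if p_value < alpha:
        direction = "increase" if mean_new > mean_old else "decrease"
        print(f"significant {direction} (p = {p_value:.3f})")
    else:
        print(f"not significant (p = {p_value:.3f})")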

Areas with the highest user satisfaction include account support, the NERSC Global Filesystem, the HPSS mass storage system, consulting services, network performance within the NERSC center, and uptime for the Jacquard, Seaborg, and Bassi systems.

7=Very satisfied, 6=Mostly satisfied, 5=Somewhat satisfied, 4=Neutral, 3=Somewhat dissatisfied, 2=Mostly dissatisfied, 1=Very dissatisfied

Item    [num who rated it 1, 2, 3, 4, 5, 6, 7]    Total Responses    Average Score    Std. Dev.    Change from 2006
SERVICES: Account support    0  0  2  1  6  82  265    356    6.71    0.57    0.07
NGF: Reliability    0  0  0  1  1  16  47    65    6.68    0.59    0.25
NGF: Uptime    0  0  0  1  1  17  47    66    6.67    0.59    0.32
HPSS: Reliability (data integrity)    0  0  1  3  4  29  111    148    6.66    0.70    -0.04
OVERALL: Consulting and Support Services    0  0  3  9  13  91  310    426    6.63    0.71    0.11
Network performance within NERSC (e.g. Seaborg to HPSS)    0  0  0  4  4  49  111    168    6.59    0.66    0.06
CONSULT: overall    0  0  4  8  7  102  241    362    6.57    0.74    0.10
CONSULT: Timely initial response to consulting questions    1  1  2  4  11  107  229    355    6.55    0.77    -0.02
Jacquard: Uptime (Availability)    0  0  0  4  6  34  82    126    6.54    0.73    -0.04
HPSS: Uptime (Availability)    0  0  1  3  6  45  96    151    6.54    0.73    -0.08
Seaborg: Uptime (Availability)    0  0  0  4  7  38  89    138    6.54    0.73    0.30
Bassi: Uptime (Availability)    1  0  1  5  6  48  122    183    6.54    0.84    0.13
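
As a check on how the Average Score and Std. Dev. columns follow from the count-per-rating columns, here is a minimal sketch in Python (not the actual survey tabulation code) that recomputes the SERVICES: Account support row above from its response counts:

    # Recompute the average rating and standard deviation for one survey item
    # from the number of respondents who selected each rating (1-7).
    import math

    # SERVICES: Account support -- 2, 1, 6, 82, 265 responses at ratings 3-7.
    counts = {3: 2, 4: 1, 5: 6, 6: 82, 7: 265}

    n = sum(counts.values())                           # 356 total responses
    mean = sum(r * c for r, c in counts.items()) / n   # count-weighted average
    # Sample standard deviation (the population formula gives the same value
    # to two decimals for this row).
    var = sum(c * (r - mean) ** 2 for r, c in counts.items()) / (n - 1)
    std = math.sqrt(var)

    print(f"n={n}  average={mean:.2f}  std dev={std:.2f}")   # n=356  average=6.71  std dev=0.57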

Areas with the lowest user satisfaction include batch wait times for Bassi and Jacquard, Franklin availability and I/O, training classes, and data analysis and visualization services.

7=Very satisfied, 6=Mostly satisfied, 5=Somewhat satisfied, 4=Neutral, 3=Somewhat dissatisfied, 2=Mostly dissatisfied, 1=Very dissatisfied

Item    [num who rated it 1, 2, 3, 4, 5, 6, 7]    Total Responses    Average Score    Std. Dev.    Change from 2006
OVERALL: Data analysis and visualization facilities    0  0  6  68  28  67  62    231    5.48    1.24    0.11
Jacquard: Batch wait time    2  3  13  6  28  40  34    126    5.47    1.46    -0.40
TRAINING: NERSC classes: in-person    0  0  2  20  3  8  18    51    5.39    1.42    -0.55
Seaborg SW: Visualization software    1  0  1  10  4  12  9    37    5.38    1.42    -0.07
Bassi SW: Visualization software    0  0  2  16  4  16  11    49    5.37    1.27    0.00
Franklin SW: Visualization software    1  0  3  19  7  19  16    65    5.34    1.38
Live classes on the web    0  1  0  23  6  16  14    60    5.30    1.29    -0.46
Franklin: Disk configuration and I/O performance    8  11  20  36  34  73  51    233    5.15    1.63
Franklin: Uptime (Availability)    7  11  48  12  46  86  47    257    5.04    1.64
Bassi: Batch wait time    11  19  36  16  33  46  22    183    4.46    1.80    -1.39

The largest increases in satisfaction over last year's survey are for the now-retired Seaborg IBM POWER3 system, for 24x7 computer and network operations support, and for the software available on our systems.

7=Very satisfied, 6=Mostly satisfied, 5=Somewhat satisfied, 4=Neutral, 3=Somewhat dissatisfied, 2=Mostly dissatisfied, 1=Very dissatisfied

Item    [num who rated it 1, 2, 3, 4, 5, 6, 7]    Total Responses    Average Score    Std. Dev.    Change from 2006
Seaborg: Batch wait time    2  2  8  12  33  47  34    138    5.53    1.32    0.59
SERVICES: Computer and network operations support (24x7)    0  0  2  11  9  46  93    161    6.35    0.95    0.31
Seaborg: Uptime (Availability)    0  0  0  4  7  38  89    138    6.54    0.73    0.30
Seaborg: overall    0  0  0  5  12  55  71    143    6.34    0.78    0.25
OVERALL: Available Software    0  0  3  19  43  157  176    398    6.22    0.87    0.24

The largest decreases in satisfaction over last year's survey are shown below.

7=Very satisfied, 6=Mostly satisfied, 5=Somewhat satisfied, 4=Neutral, 3=Somewhat dissatisfied, 2=Mostly dissatisfied, 1=Very dissatisfied

Item    [num who rated it 1, 2, 3, 4, 5, 6, 7]    Total Responses    Average Score    Std. Dev.    Change from 2006
Bassi: Batch wait time    11  19  36  16  33  46  22    183    4.46    1.80    -1.39
Jacquard SW: Visualization software    0  0  0  11  4  13  10    38    5.58    1.18    -0.54
Jacquard: Batch wait time    2  3  13  6  28  40  34    126    5.47    1.46    -0.40
Bassi: Batch queue structure    2  2  9  26  25  66  46    176    5.57    1.32    -0.35
Bassi: overall    1  1  7  7  30  74  67    187    5.96    1.11    -0.30

Survey Results Lead to Changes at NERSC

Every year we institute changes based on the previous year's survey. In 2007 and early 2008, NERSC took a number of actions in response to suggestions from the 2006 user survey.

  1. 2006 user survey: On the 2006 survey, four users expressed concerns that the MOTD was not updated quickly enough after status changes, or that it was too long.

    NERSC response: The computer and network operations support staff have streamlined their procedures for managing status changes during outages, giving users more current status information both in the MOTD and by email (for users registered to receive status emails). The operations staff have also increased their knowledge of account support procedures in order to provide better off-hours support. The satisfaction score for Computer and network operations (24x7) support showed a significant increase of 0.3 points over last year's score.

  2. 2006 user survey: On the 2006 survey a number of users requested longer wall times for the largest machines.

    NERSC response: In January the wall time for Franklin's regular queues was increased from 12 hours to 24, and then to 36 hours in May. The satisfaction score for Franklin queue structure on the 2007/2008 survey was 6.03 out of 7.

  3. 2006 user survey: On the 2006 survey a number of users commented on poor reliability for PDSF disks, and the satisfaction score for PDSF disks had the lowest PDSF hardware rating (5.1).

    NERSC response: NERSC has retired over 90 percent of the old NFS disk vaults and has installed new fiber channel based storage with better failover capabilities. In 2007/2008 the satisfaction score for PDSF disks increased to 5.54.

  4. 2006 user survey: On the 2006 survey a number of users requested more resources for interactive and debug jobs.

    NERSC response: NERSC now reserves nodes for interactive/debug jobs on weekends as well; previously this was done only Monday through Friday. This change did not significantly affect the satisfaction ratings.

  5. 2006 user survey: On the 2006 survey, users asked that we provide more cycles and get Franklin online ASAP.

    NERSC response: Franklin was delivered in January and February 2007 and was initially installed with Catamount on the compute nodes. NERSC and Cray then decided to install, and successfully tested for production use, an early release of Compute Node Linux (CNL). Early users started using Franklin with CNL in July 2007, and all users had access in September. Franklin was accepted in late October 2007. Since then NERSC and Cray have worked together to improve system stability, I/O performance and the user environment. In July 2008 NERSC and Cray began upgrading Franklin's compute nodes from dual-core to quad-core processors, doubling both the number of cores and the amount of memory. We are looking forward to future enhancements, such as integrating the compute nodes with the NERSC Global Filesystem.

Users are invited to provide overall comments about NERSC:

150 users answered the question "What does NERSC do well?"

  • 103 respondents stated that NERSC gives them access to powerful computing resources without which they could not do their science;
  • 59 mentioned excellent support services and NERSC's responsive staff;
  • 24 highlighted good software support or an easy to use user environment;
  • 17 pointed to good queue management or job turnaround;
  • 15 mentioned data services (HPSS, large disk space, purge policy, NGF, data analysis).

Some representative comments are:

 

The NERSC facility is fantastic. I'm very pleased with the hardware available, the people, the help, and the queues.
NERSC is generally excellent, and has both leadership computing power and good ease of use, increasing productivity. This is most of all because the staff are very responsive to user needs and are effective in making leadership class machines work well for user applications. Additionally, the queue structure is clear and usable, the networking is very good, and the storage resources are adequate for large jobs. The analytics and visualization programs and associated software support are very important.
Good computing. Good storage. We always need more.
What NERSC is best at is the combination of large-scale computing facilities with more flexible queuing policies than in other comparable facilities. Also the existence of "small-scale supercomputers" (Jacquard) is very useful to make tests.
NERSC is excellent. Franklin is a great resource - lots of cores. The waiting of queues for large core runs is very nice. [Obviously there is a wait time for 16384 core run for 36 hours :) ]
NERSC does customer service very well. I am always pleased whenever I deal with NERSC. I would also say that NERSC's infrastructure for users is very helpful.

108 users responded to "What should NERSC do differently?"

The top three areas of concern were Franklin stability and performance, job scheduling and resource allocation policies, and the need for more or different hardware resources. NERSC will analyze these comments and implement changes where possible over the next year.

Some of the comments from this section are:

It would be great if NERSC could magically improve the stability of Franklin... Unfortunately, hardware failures increase with the size and complexity of the system.
Need to improve network and load management on the log in nodes for Franklin. At times it is very difficult to get any work done since the response time is so slow.
Providing for long serial queues (~12 hours) and enabling these for applications such as IDL would further improve the usefulness of Franklin in particular. We appreciate your efforts to do this and look forward to finding a solution with you soon.
Less emphasis on INCITE, special users. More emphasis on providing high throughput for production applications.
As computing clusters grow, it would be very interesting/helpful for NERSC to invest in robust queuing systems such as Google's MapReduce model. It seems that all of NERSC's clusters are based upon the premise that failures are abnormal and can be dealt with as a special case. As clusters and job sizes grow, single point failures can really mess up a massively parallel job (Franklin) or a large number of parallel jobs (bad nodes draining queues on PDSF). Companies like Google have succeeded with their computing clusters by starting with the premise that hardware failures will happen regularly and building queuing systems that can automatically heal, rather than relying upon the users to notice that jobs are failing, stop them, alert the help system, wait for a fix, and then resubmit jobs.
I would suggest doing more to discourage single node and small jobs
NERSC's seaborg was a great success because of its reliability and its large amount of per-node memory. That led to the situations that majority of scientific codes ran well on it. The future computer (NERSC6) shall have a configuration with large amount of per-node memory (similar to bassi or larger, but with larger amount CPUs than bassi has).
Bassi has become so busy as to be almost useless to me.
NERSC should tell more about their strategic plans. Hopefully in three years we will be operating differently than we are now (command line submission, manual data management etc.) Is NERSC going to actively help with this, or simply be a resource provider? Is NERSC going to help campaign to get better performance and resiliency tools (fault tolerance) actually put into production vs being left as academic demos?
More disk space to users. The whole point of having a LARGE cluster is to do LARGE simulations. That means LARGE amounts of data. We should get more storage space (individually).

104 users answered the question "How does NERSC compare to other centers you have used?" 61 users stated that NERSC was an excellent center or was better than other centers they have used. Reasons given for preferring NERSC include its consulting services and responsiveness, its security, and its queue management.

25 users said that NERSC was comparable to other centers or gave a mixed review and 11 said that NERSC was not as good as another center they had used. Some users expressed dissatisfaction with user support, with available disk space or with queue turnaround time.

 

Here are the survey results:

  1. Respondent Demographics
  2. Overall Satisfaction and Importance
  3. All Satisfaction, Importance and Usefulness Ratings
  4. Hardware Resources
  5. Software
  6. Visualization and Data Analysis
  7. HPC Consulting
  8. Services and Communications
  9. Web Interfaces
  10. Comments about NERSC

Respondents by DOE Office and User Role:

Office    Respondents    Percent
ASCR 53 11.3%
BER 55 11.8%
BES 133 28.5%
FES 64 13.7%
HEP 58 12.4%
NP 104 22.3%
User Role    Number    Percent
Principal Investigators 71 15.2%
PI Proxies 63 13.5%
Project Managers 7 1.5%
Users 326 69.8%

 

Respondents by Organization:

Organization Type    Number    Percent
Universities 275 58.9%
DOE Labs 145 31.0%
Other Govt Labs 26 5.6%
Industry 16 3.4%
Private labs 5 1.1%
Organization    Number    Percent
Berkeley Lab 70 15.0%
UC Berkeley 30 6.4%
Oak Ridge 14 3.0%
PNNL 12 2.6%
NREL 11 2.4%
Tech-X Corp 11 2.4%
U. Washington 11 2.4%
U. Wisconsin 11 2.4%
UC Davis 11 2.4%
NCAR 8 1.7%
U. Tennessee 8 1.7%
Argonne 6 1.3%
Livermore 6 1.3%
PPPL 6 1.3%
U. Colorado 6 1.3%
U. Maryland 6 1.3%
SLAC 5 1.1%
Colorado State 5 1.1%
Ohio State 5 1.1%
Stanford 5 1.1%
Texas A&M 5 1.1%
U. Chicago 5 1.1%
U. Illinois 5 1.1%
U. Oklahoma 5 1.1%
Vanderbilt 5 1.1%
Organization    Number    Percent
Brookhaven 4 0.9%
Auburn Univ 4 0.9%
MIT 4 0.9%
Rice University 4 0.9%
U. Michigan 4 0.9%
UC Irvine 4 0.9%
UCLA 4 0.9%
APC Lab Astro France 3 0.6%
Cal Tech 3 0.6%
Georgia State 3 0.6%
Harvard 6 1.3%
Jefferson Lab 3 0.6%
Los Alamos Lab 3 0.6%
Louisiana State 3 0.6%
Northwestern 3 0.6%
Princeton 3 0.6%
U. Texas 3 0.6%
UC Santa Barbara 3 0.6%
William & Mary 3 0.6%
NASA GISS 3 0.6%
Other Universities 95 20.3%
Other Gov. Labs 15 3.2%
Other DOE Labs 5 1.1%
Other Industry 5 1.1%
Private labs 5 1.1%

 

Which NERSC resources do you use?

Resource    Responses    Percent    Num who answered questions on this topic    Percent
NERSC Information Management (NIM) System 280 60.0% 363 77.7%
NERSC web site (www.nersc.gov) 278 59.5% 383 82.0%
Cray XT4 Franklin 268 57.4% 293 62.7%
IBM POWER5 Bassi 214 45.8% 225 48.2%
Linux Cluster Jacquard 170 36.4% 185 39.6%
HPSS Mass Storage System 164 35.1% 195 41.8%
Consulting services 164 35.1% 362 77.5%
IBM POWER3 Seaborg (now retired) 147 31.5% 168 36.0%
Account support services 127 27.2% 356 76.2%
PDSF Cluster 75 16.1% 95 20.3%
DaVinci 66 14.1% 115 24.6%
Off-hours 24x7 Computer and Network Operations support 47 10.1% 161 34.5%
NERSC Global Filesystem (NGF) 24 5.1% 73 15.6%
Visualization services 15 3.2% 71 15.2%
NERSC CVS server 11 2.4% 94 20.1%
Grid services 8 1.7% 42 9.0%

 

How long have you used NERSC?

Time    Number    Percent
less than 6 months 89 19.4%
6 months - 3 years 199 43.4%
more than 3 years 171 37.3%

 

What desktop systems do you use to connect to NERSC?

System    Responses
Unix Total 347
PC Total 232
Mac Total 207
Linux 316
OS X 206
Windows XP 181
Windows Vista 41
Sun Solaris 23
Windows 2000 10
IBM AIX 3
HP HPUX 2
MacOS 1
SGI IRIX 1

 

Web Browser Used to Take Survey:

Browser    Number    Percent
Firefox 2 202 43.3%
Safari 79 16.9%
MSIE 7 54 11.6%
Firefox 3 52 11.1%
Firefox 1 35 7.5%
Mozilla 25 5.4%
MSIE 6 17 3.6%
Opera 3 0.6%

 

Operating System Used to Take Survey:

OS    Number    Percent
Mac OS X 170 36.4%
Linux 138 29.6%
Windows XP 128 27.4%
Windows Vista 20 4.3%
Windows Server 2003 5 1.1%
Windows 2000 3 0.6%
SunOS 3 0.6%