2004 User Survey Results

Visualization and Data Analysis

Where do you perform data analysis and visualization of data produced at NERSC?

 

Location                                       Responses   Percent
All at NERSC                                          12      5.8%
Most at NERSC                                         33     16.0%
Half at NERSC, half elsewhere                         40     19.4%
Most elsewhere                                        57     27.7%
All elsewhere                                         49     23.8%
I don't need data analysis or visualization           15      7.3%
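
The percentages in this table can be cross-checked from the response counts. The following is a minimal sketch, assuming each percentage is the count divided by the sum of all six rows and rounded to one decimal place; the names `counts` and `percentages` are illustrative and not part of the survey.

```python
# Response counts copied from the table above.
counts = {
    "All at NERSC": 12,
    "Most at NERSC": 33,
    "Half at NERSC, half elsewhere": 40,
    "Most elsewhere": 57,
    "All elsewhere": 49,
    "I don't need data analysis or visualization": 15,
}


def percentages(counts):
    """Return each count as a share of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100.0 * n / total, 1) for label, n in counts.items()}


total = sum(counts.values())
shares = percentages(counts)

for label, n in counts.items():
    print(f"{label:45s} {n:3d} {shares[label]:5.1f}%")
print(f"Total responses: {total}")
```

Under that assumption the six rows sum to 206 responses and the computed shares match the table. "Most elsewhere" and "All elsewhere" together account for 106 responses, slightly more than half.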

Are your data analysis and visualization needs being met? In what ways do you make use of NERSC data analysis and visualization resources? In what ways should NERSC add to or improve these resources?

[Read all 80 responses]

 

20   Comments about Seaborg use
15   Don't use / don't need
11   Requests for additional services
10   Interactions with Visualization Group members
10   Do data analysis / visualization locally
8   Comments about PDSF use
7   Services meet our needs
1   Comments about the Math Server Newton

 

Comments about Seaborg use:   20 responses

Just need to have GrADS, NCL, NCO, ferret on Seaborg.

NERSC response: ferret and NCAR/NCL are installed on escher. NCAR/NCL is installed on seaborg. For the others, you may formally request that the center obtain and install software by completing the Software Request Form.

Yes-- no significant or unusual needs (mostly using idl)

Data analysis and visualization is fine. We use a custom-written IDL library for analysis, and seaborg has an adequate number of IDL licenses.

... I do some runtime visualization using standard pgn library.

I am a frequent user of the IDL software on seaborg. The service is basically flawless, very satisfactory, reliable and useful. The occasional problem is the lack of available licenses (which have been recently really sporadic). The other issue is the tight run time limit for the interactive jobs which applies also to IDL runs and prevents a full exploitation of the IDL capabilities in terms of the image processing, basic mathematical operations etc.

NERSC response: You may wish to consider using the visualization server escher (to be upgraded to DaVinci, an 8-CPU SGI Altix system, in the first half of 2005) for your IDL processing. Interactive time limits are much longer (essentially unlimited) on the vis server.

For some of our data analysis, we use serial queues just to get access to memory. This seems like a waste of resources. If other options are available, then it would be nice to know about them.

NERSC response: When the new vis server, DaVinci (an 8-CPU SGI Altix system), goes into production in the first half of 2005, it will likely offer batch queues that you may find useful.

Currently, most data analysis is done offsite. However, I do occasionally run serial jobs on Seaborg to perform post-processing of data. We are looking at making greater use of Escher in the future.

yes. I like the serial queues on Seaborg for post-processing.

In practice, I only use gnuplot at NERSC. An easier-to-use and more powerful visualization tool (e.g., for plotting molecules) would be helpful. Such tools should be on seaborg, so the calculated data can be analyzed immediately. Anything that requires transferring data to another machine will not be as useful as an analytical tool.

NERSC response: garlic, vmd and rasmol (molecular viewers) are installed on both escher and seaborg.

I occasionally submit jobs to the NERSC serial queue to process large numbers of data files. Jobs in the serial queue typically started quickly and otherwise ran fine.

I do most of my post-processing off site, so simple visualization with GNUPLOT or GRACE is sufficient.

I don't have any personal ones. My group's concerns have been met through my NUG involvement. [pre and post processing queues on Seaborg]

I am using IDL mostly, and it works fine for me.

IDL was once used. It did not seem very interactive with the required SSH software, which does not support graphics.

NERSC response: Use the -X argument to your ssh client to force X11 traffic to be tunneled from the NERSC Center back to your workstation. See How To Route Graphics Output through SSH.
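
As an illustration of the -X advice above, here is a minimal sketch that assembles such an ssh command line. The user name `jdoe`, the host name `seaborg.nersc.gov`, and the helper `ssh_x11_command` are hypothetical placeholders chosen for the example; only the `-X` option comes from the response itself.

```python
import shlex


def ssh_x11_command(user, host, remote_command=None):
    """Build an ssh command line with X11 forwarding (-X) enabled."""
    cmd = ["ssh", "-X", f"{user}@{host}"]
    if remote_command:
        # Optional program to start on the remote side, e.g. "idl".
        cmd.append(remote_command)
    return cmd


# Hypothetical account and host, used for illustration only.
cmd = ssh_x11_command("jdoe", "seaborg.nersc.gov", "idl")
print(shlex.join(cmd))
```

The sketch only builds and prints the command. Running it requires an ssh client and an X server on the local workstation so that remote graphics windows have somewhere to display.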

No. AVS 5.6 has been dead on seaborg for a while and I am told that it may remain dead on seaborg forever. That has been very inconvenient for the instant monitoring of data output. Currently I have to transfer data from seaborg to escher for visualization and analysis. I hope NERSC can restore AVS 5.6 on seaborg soon.

NERSC response: To be solved, this particular issue will require the vendor to provide a version of the application updated for the most recent version of AIX -- there is nothing that NERSC "can fix" to make AVS 5.6 work on seaborg. A trouble ticket was filed with the vendor early in Fall 2004. Note that AVS version 5.5 is available for your use on seaborg, and it is substantially similar in function to AVS version 5.6. Also, the most recent version of AVS/Express is installed and functional on seaborg.

Through routines written for IDL, running on Seaborg.

I only use xmgrace on seaborg as far as visualization goes, because I don't know much about other software. ...

Suggestions: ...
2- Devote a small subset of nodes (2-4) to serial jobs for data analysis.
3- Devote one node to fast data transfer to/from HPSS, using htar, etc.

Please keep IDL.

... I do use Seaborg for some data analysis when the data set is really large and would not fit on our local filesystem or would take too long to transfer.

Don't use / don't need:   15 responses

Not applicable at this time. Could be important in future calculations.

Don't need

I don't use data analysis and visualization at NERSC.

I do not use these NERSC resources regularly.

Do very little data analysis and visualization at NERSC.

I do not use NERSC facilities for data analysis or visualization.

No comment, others in our group carry out the data analysis/visualization.

I do not do any visualization work.

Not used.

Don't use.

I hardly use the visualization facilities at present, so I am unable to make any comments.

I don't do much Visualization and Data Analysis at NERSC.

I have none.

Do not use

We have not yet done substantial analysis and visualization to give you meaningful comment.

Requests for additional services:   11 responses

It would be nice if we had GRACE and gnuplot on the pdsf interactive machines, so that we can plot and check some data remotely. [PDSF user]

NERSC response: To request new software, please fill out the Software Request Form.

I use a package developed at LLNL (called VCDAT) to work with NetCDF files. Currently I drag the necessary results from NERSC to LLNL in order to use VCDAT. It would be useful if VCDAT were installed at NERSC. However, I understand that this would require the LLNL developers to put extra effort in too.

NERSC response: NERSC attempted to install VCDAT on Seaborg. However, there is no AIX version available and the installation failed. If VCDAT releases an AIX version we will revisit this issue.

Do you have a license for Tecplot? That would be handy for Nimrod simulation data analysis.

NERSC response: We do not have any Tecplot licenses. Each floating license costs $3200 for the initial purchase, and $640/yr thereafter (per license) in maintenance. You may formally request that the center purchase and install Tecplot by completing the Software Request Form.

Would like to do more analysis/vis on NERSC. Many of my jobs require multiple restarts on a 12 or 24 hour queue. In particular, would like to have some plots done automatically at the end of a (batch) run so no time is wasted shipping data to another computer to analyze before deciding on restarting. My present analysis programs are not designed for MPP.

I don't know if it is my fault, but it seems to me that the visualization tools are not very transparent for users. It would be useful to have access to widely advertised tutorials or short online courses for learning the visualization capabilities at NERSC.

I want ImageMagick on seaborg. Then I could make movies there, and that would complete my viz needs. I asked and was told that this is not possible.

NERSC response: NERSC will investigate installing ImageMagick on seaborg.

Tools capable of visualization and quantitative analysis of terabytes of data are urgently needed.

NERSC response: During 2004, NERSC evaluated visualization software that uses a scalable, distributed architecture. EnSight was installed on both Seaborg and Escher.

... How about sending "Tips of the month" to users by e-mail.

Suggestions:
1- Make getting accounts in Escher easier. ...

I don't know how to use the tools available at NERSC. Maybe NERSC can offer some training.

NERSC response: There is a wealth of information, including "how-to" material on the Visualization Group's website.

The visualization group has provided no useful service to me. I generate very large data sets that require the development of specialized analysis tools for each new problem. I generate the data with a parallel run, then postprocess that data either in parallel or serial to produce a reduced dataset which I then transfer to my local machine for further processing or visualization. Alternatively, my data may require no post-processing, and is visualized directly with our own tools based on demand-driven I/O and X-windows/Motif windows, and run directly on seaborg. For my application, and for many others I'd guess, the role of a viz group directly in the scientific analysis is not clear, for they should not be expected to understand or care about the sort of data derivation and analysis specific to my field.
If it were up to me, the viz group would be creating new and interesting ways to look at my data (new sorts of 2D or 3D vector/scalar/isosurface plots, with interesting new ideas to bring out features I hadn't seen). They'd have a web page full of neat and new interesting viz ideas, and links to software/code written to be applied to simplified datasets as examples in order to demonstrate the technique. Furthermore, they'd be funded to supply some ideas/manpower on how to generalize these neat new ideas into the format of an interested user. Such applications might cover the range from debugging tools, presentation graphics, web-based java/flash animation control, stereo, 2D and 3D line-integral convolution things, isosurface extraction and processing, whatever. Finally, they might even develop data standards on which they or others can build such viz tools primitives, working more with the folks in the NERSC community that must do all this stuff themselves already.
From my perspective, the viz group has access to a tremendous amount of compute hardware, and intellectual access to a huge variety of scientific research groups. However, they have neither the manpower nor the inclination to build such generalized tools or data standards for the NERSC community. The extent to which I've seen their contributions has been limited to demonstrations of viz-data pipe throughputs, multi-lab transfer rates, etc., and from my viewpoint these demonstrations are of very questionable utility.

NERSC response: In some cases, general purpose "hammers" are handy, while in others, more specialized tools are appropriate. In these latter cases, it is crucial that the tool makers understand the ultimate use of the tool. Generally speaking, it is crucial that the visualization community have some level of understanding of the science problem in order to produce relevant and useful technology. It sounds like your project has its own set of tools and techniques that are used for generating and analyzing data. Perhaps your project would be able to take advantage of NERSC's analysis/visualization facilities, which include (as of the time of this writing) a pair of SMP platforms: one has 12 CPUs/24 GB RAM/4 TB of scratch disk; the other has 8 CPUs/48 GB RAM/3 TB of scratch disk. Both have excellent system balance for data analysis: they favor memory size and I/O rates over raw cycles.

You make many good suggestions and the NERSC Visualization group has implemented many of them already as evidenced on our website:

  • Example visualizations
  • Novel use of standard, web-based delivery for interactive 3D visualization (QuickTime VR Object movies); see Leveraging QuickTime VR as a Delivery Vehicle for Remote and Distributed Visualization
  • Data Conversion utilities for using standard visualization tools with AMR data; see AMR Visualization at Berkeley Lab
  • We created an HDF5 "data standard" for use by the 21st Century Accelerator SciDAC; see: Particle Viewer

 

It should be kept in mind that the NERSC Visualization group is more in the deployment business than the "research and development business." As such, it is usually beyond the scope of the NERSC mission to define data standards for projects. The concept of focusing on data management and modeling to form the central implementation core of a computational science simulation and analysis project is sound. As the scope of such central cores continues to diversify, the challenge (for NERSC) is to provide a combination of software tools and infrastructure that are sufficiently flexible to be widely used across many projects. This is a moving target that requires constant input from the NERSC user community as well as constant effort to maintain.

Interactions with Visualization Group members:   10 responses

My group has done both very large cosmological simulations, which we have visualized elsewhere, and the largest program of hydrodynamic simulations of galaxy interactions, some of which we have visualized with the help of NERSC visualization staff. We now have done many more, and higher resolution, hydro simulations, and we look forward to working with the NERSC visualization folks to make state-of-the-art movies based on them. The challenge is to visualize the many dimensions represented by our outputs.

I've used these services only once or twice to produce digital movies.

Yes, the Viz group has been very helpful.

Excellently met. Cristina Siegerist has been helping us generate amazing images related to our research.

Work with visualization group.

I've used the viz group once with help producing a movie and would love to use their services again. They were prompt, efficient and helpful. The majority of my data analysis is done elsewhere, but I've been satisfied with what I've done at NERSC.

Yes, our needs are largely being met by NERSC visualization services staff who have devoted substantial time to working on our color images.

I am part of one of the INCITE projects, and for us visualization has been a very important tool. We as chemists are not so familiar with visualization programs, so the help we have received at NERSC has been extremely valuable. This is especially true of the work done by Cristina Siegerist, who has been an incredible help for this project. She is extremely hard working and knows exactly what she is doing, so most of the time we have received much better results than we expected. She always goes beyond our expectations and has even proposed new ways of taking advantage of the information we generate.

My research group has worked with Cristina Siegerist, who has created images that allow us to visualize, for the first time, the walkers in our Monte Carlo simulation. She has also enabled us to visualize aggregate data from the ensemble of walkers. These contributions have been extremely valuable.

Escher -- making 2D movies using IDL codes developed by the visualization group

Do data analysis / visualization locally:   10 responses

We do our analysis at our own sites.

We have our own local viz expert so we do the visualization locally. ...

We don't need these services. All graphics data output is analyzed and visualized on our local LINUX workstations (using mostly XMGRACE and PLOTMTV).

Do this locally.

Most of my visualization needs are done by the visualization group at Argonne National Lab. ...

I do all analysis of the results at local machines.

I generate viz data on NERSC and visualize it on local machine. I am considering using NERSC viz tools.

Postprocessing code that we wrote ourselves processes the data on Seaborg, and we download the results and visualize them locally on a Linux machine. So far, we have not used the NERSC visualization tools.

I do all my data analysis and visualization on local PCs running linux. Reasons are that I use software probably not installed on NERSC, and don't want to experience delays from the display of postscript graphics from NERSC to a local machine that is at the other end of the country when the connection is heavily used.

NERSC response: You might have better luck using gs to render your PostScript file to a raster image, then displaying the image using ImageMagick or a similar utility. In some cases, using gs to render large PostScript files across a slow network would be quite slow. Refer to One-Step JPEG-from-PS Creation for details on using gs to render your PostScript file to a JPEG file.
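
To make the gs suggestion above concrete, here is a minimal sketch of a Ghostscript command line that renders a PostScript file to JPEG. It is not taken from the referenced how-to page; the helper name `gs_jpeg_command`, the file name `plot.ps`, and the default resolution of 100 dpi are assumptions made for illustration. The options used (`-dBATCH`, `-dNOPAUSE`, `-dSAFER`, `-q`, `-sDEVICE=jpeg`, `-r`, `-sOutputFile=`) are standard Ghostscript switches.

```python
import shlex
from pathlib import Path


def gs_jpeg_command(ps_path, jpeg_path=None, dpi=100):
    """Build a Ghostscript command that renders a PostScript file to JPEG."""
    ps_path = Path(ps_path)
    if jpeg_path is None:
        # Default: same name as the input, with a .jpg extension.
        jpeg_path = ps_path.with_suffix(".jpg")
    return [
        "gs",
        "-dBATCH",       # exit after the last input file
        "-dNOPAUSE",     # do not prompt between pages
        "-dSAFER",       # restrict file operations
        "-q",            # suppress startup messages
        "-sDEVICE=jpeg", # raster output device
        f"-r{dpi}",      # resolution in dots per inch
        f"-sOutputFile={jpeg_path}",
        str(ps_path),
    ]


# Hypothetical input file, used for illustration only.
cmd = gs_jpeg_command("plot.ps")
print(shlex.join(cmd))
```

The sketch prints the command rather than running it. On a system where gs is installed, the same list could be passed to `subprocess.run(cmd, check=True)` on the remote machine, so that only the resulting JPEG has to cross the network.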

In most cases, I copy the data files to local computers and do the visualizations there.

Comments about PDSF use:   8 responses

Only use ROOT.

We have our own IDL-based routines for data and visual analysis.

We actually use PDSF for all of these.

Data analysis on PDSF. No real visualization to speak of.

All my needs are met by software which is custom to me and/or my group.

I just use PDSF

I do my data analysis using the batch queues on PDSF. It's an extremely useful system for me.

The switch to SGE is great!

Services meet our needs:   7 responses

The present visualization resources are adequate for my needs.

satisfied

Yes, it meets my needs so far.

present service is adequate.

Yes

Yes.

Adequate

Comments about the Math Server Newton:   1 response

Access to Matlab on Newton.