NERSC: Powering Scientific Discovery Since 1974

Data Analysis and Mining


Data analysis techniques include post-processing (e.g., data statistics) of experimental datasets and/or simulation output, as well as the use of mathematical methods (e.g., filtering data) and statistical tests. Data mining usually refers to the application of more advanced mathematical techniques such as classification, clustering, pattern recognition, etc.
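
As a small illustration of the post-processing and filtering steps mentioned above, the sketch below computes summary statistics and applies a moving-average filter to a short series. The sample values and the `moving_average` helper are hypothetical and are not part of any NERSC tool; only the Python standard library is assumed.

```python
import statistics


def moving_average(values, window):
    """Smooth a sequence with a simple sliding-window mean (a basic filter)."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical measurements standing in for experimental or simulation output.
samples = [2.0, 4.0, 6.0, 8.0, 10.0]

mean = statistics.mean(samples)        # post-processing: a data statistic
spread = statistics.stdev(samples)     # sample standard deviation
smoothed = moving_average(samples, 3)  # mathematical method: filtering
```

Data mining techniques such as clustering or classification would typically be applied after this kind of cleanup.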

Quick Links: NERSC Tools for Data Analysis and Data Mining

Visualization and Analysis for Nanoscale Control of Geologic Carbon Dioxide

Goals
* Collect experimental 2D-3D imaging data in order to investigate fluid-fluid and fluid-rock interactions;
* Provide algorithms for better understanding of processes governing fluid-fluid and fluid-rock systems, related to geologic sequestration of CO2;
* Develop image processing methods for analyzing experimental data and comparing it to simulations;
* Detect/reconstruct material interfaces, quantify contact angles, derive contact angle distribution, etc.

Impact
* Unveil knowledge required… Read More »


Analysis of Void Space of Porous Materials Used in Energy-related Applications

We have developed tools, based on partial differential equations, for analyzing porous materials. These tools apply the Fast Marching Method (FMM) to predict whether a molecule can traverse a channel system representing the void space of a material, to map the accessible parts of that void space, and to calculate accessible volumes and surfaces. (More… Read More »
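
To make the idea concrete, here is a minimal sketch of a first-order Fast Marching solve on a small 2D grid. It is not the tool described above: the grid, the `fast_marching` and `accessible_cells` names, the unit grid spacing, and the convention that a speed of 0 marks solid material are all assumptions made for illustration. Cells that the front never reaches keep an infinite arrival time, which is how inaccessible void space shows up.

```python
import heapq
import math


def fast_marching(speed, source):
    """Solve |grad T| * speed = 1 on a 2D grid with unit spacing.

    speed[i][j] > 0 marks void space; speed[i][j] <= 0 marks solid cells.
    Returns arrival times T, with math.inf for cells the front cannot reach.
    """
    rows, cols = len(speed), len(speed[0])
    si, sj = source
    if speed[si][sj] <= 0:
        raise ValueError("source must lie in void space")
    T = [[math.inf] * cols for _ in range(rows)]
    done = [[False] * cols for _ in range(rows)]
    T[si][sj] = 0.0
    heap = [(0.0, si, sj)]

    def accepted(i, j):
        # Only finalized cells may contribute to the upwind update.
        if 0 <= i < rows and 0 <= j < cols and done[i][j]:
            return T[i][j]
        return math.inf

    while heap:
        _, i, j = heapq.heappop(heap)
        if done[i][j]:
            continue  # stale heap entry
        done[i][j] = True
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if not (0 <= ni < rows and 0 <= nj < cols):
                continue
            if done[ni][nj] or speed[ni][nj] <= 0:
                continue
            a = min(accepted(ni - 1, nj), accepted(ni + 1, nj))
            b = min(accepted(ni, nj - 1), accepted(ni, nj + 1))
            f = 1.0 / speed[ni][nj]
            if abs(a - b) >= f:
                # Front arrives from one axis only (also covers a or b = inf).
                t_new = min(a, b) + f
            else:
                # Two-axis upwind solution of the discretized Eikonal equation.
                t_new = 0.5 * (a + b + math.sqrt(2.0 * f * f - (a - b) ** 2))
            if t_new < T[ni][nj]:
                T[ni][nj] = t_new
                heapq.heappush(heap, (t_new, ni, nj))
    return T


def accessible_cells(T):
    """Count cells reached by the front (a 2D stand-in for accessible volume)."""
    return sum(1 for row in T for t in row if math.isfinite(t))


# Hypothetical material: a solid wall (0) isolates the right-hand column.
grid = [
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 1, 0, 1],
]
arrival = fast_marching(grid, (0, 0))
reachable = accessible_cells(arrival)
```

In this toy grid the front started at the top-left corner fills the two left-hand columns and never crosses the wall, so the right-hand column is reported as inaccessible. A real analysis would work in 3D and would derive the speed field from the molecule's size and the material's structure.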


Modeling structural properties of breast cancer cells

Goals
PSOC is a large, multi-institution effort to study the physical mechanisms underlying tumor progression in breast cancer. One goal is to use 3D culture models, confocal imaging, and simulation experiments to show how mechanical forces affect proteins, cells, and tissues; this particular effort relies on active collaboration among NERSC, LBNL, and UC Berkeley. To do so, we have built image analysis algorithms that take time slices of stacks of two-dimensional images, representing… Read More »


Overview
MySGE allows users to create a private Sun GridEngine cluster on large parallel systems like Hopper or Franklin. Once the cluster is started, users can submit serial jobs, array jobs, and other throughput-oriented workloads to the personal SGE scheduler. The jobs are then run within the user's private cluster.

How it works
When the user executes vpc_start, a job is submitted to the standard system scheduler (Moab). The user can specify the requested time and number of cores using… Read More »