NERSC: Powering Scientific Discovery Since 1974

Performance and Optimization

Note: some of the performance tips and recommendations on this page are based on the Edison Phase I test results. While we do not expect major changes in our recommendations, we will still update this page when we have new data for Edison Phase II.

Performance comparison between Edison and Hopper

Edison used the same benchmark suite as Hopper, the NERSC-6 benchmark suite, to measure system performance throughout its procurement process. Instead of peak flops, NERSC uses the sustained system performance (SSP) to measure a system's computational capability. Edison is 2-3 times faster than Hopper on the seven applications used for SSP.
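As a rough illustration of how a sustained-performance figure differs from peak flops, the sketch below combines per-core application rates into a single number. The formula is an assumption, not something stated on this page: it takes the geometric mean of the per-core rates and scales it by the core count. The function name `ssp` and all input values are hypothetical placeholders, not measured Edison or Hopper data.

```python
import math


def ssp(per_core_rates, num_cores):
    """Assumed SSP form: geometric mean of per-core rates times core count."""
    if not per_core_rates:
        raise ValueError("need at least one application rate")
    if any(rate <= 0 for rate in per_core_rates):
        raise ValueError("rates must be positive")
    # Average the logs, then exponentiate, to get the geometric mean.
    log_mean = sum(math.log(rate) for rate in per_core_rates) / len(per_core_rates)
    return num_cores * math.exp(log_mean)


# Placeholder per-core rates for three applications (not measured data).
example_rates = [1.0, 2.0, 4.0]
example_ssp = ssp(example_rates, num_cores=100)
print(round(example_ssp, 6))
```

A geometric mean keeps one unusually fast application from dominating the result, which is one reason such a figure can be more representative than a peak rate.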

Compiler Comparisons

Using a set of benchmarks described below, different optimization options are compared for the different compilers on Edison. The compilers are also compared against one another on the…

Math Library Performance

Fully optimizing a given application’s performance often requires a deep understanding of the source, an accurate profile for a representative run, and the ability to have changes to the source accepted upstream. However, in many cases, significant performance gains can be achieved by simply optimizing the code over the matrix of possible compilers, compiler options, and libraries available on a given machine. Here, we explore the performance variability of common materials science…
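The "matrix" of compilers, options, and libraries can be enumerated mechanically. The sketch below is a minimal illustration, not the procedure used for the results on this page; the function name `build_matrix` and the compiler, flag, and library labels are placeholders chosen for the example.

```python
from itertools import product


def build_matrix(compilers, flag_sets, libraries):
    """Return every (compiler, flags, library) combination as a dict."""
    return [
        {"compiler": compiler, "flags": flags, "library": library}
        for compiler, flags, library in product(compilers, flag_sets, libraries)
    ]


# Placeholder labels; substitute whatever is installed on the target machine.
configs = build_matrix(
    compilers=["compiler_a", "compiler_b", "compiler_c"],
    flag_sets=["-O2", "-O3"],
    libraries=["mathlib_x", "mathlib_y"],
)
print(len(configs))  # 3 * 2 * 2 combinations
```

Each entry can then be built and timed separately, so the fastest combination is found without touching the application source.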

Core Specialization

Core Specialization (CS) is a feature of the Cray operating system that allows the user to reserve one or more cores per node for handling system services, and thus reduce the effects of timing jitter due to interruptions from the operating system at the expense of (possibly) requiring more nodes to run an application. The specialized cores may also be used in conjunction with Cray's MPI asynchronous progress engine [1] to improve the overlap of communication and computation for applications…
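The node-count trade-off mentioned above is simple arithmetic. The sketch below assumes one MPI rank per usable core; the function name `nodes_needed` and the example of 24 cores per node are illustrative assumptions, not values taken from this page.

```python
def nodes_needed(total_ranks, cores_per_node, reserved_cores=0):
    """Nodes required when `reserved_cores` per node are set aside for system services."""
    usable = cores_per_node - reserved_cores
    if usable <= 0:
        raise ValueError("no cores left for the application")
    # Integer ceiling division: round up to whole nodes.
    return (total_ranks + usable - 1) // usable


# Hypothetical job: 96 ranks on nodes with 24 cores each.
print(nodes_needed(96, 24))                    # no cores reserved
print(nodes_needed(96, 24, reserved_cores=1))  # one core per node reserved
```

In this hypothetical case, reserving a single core per node pushes the job from 4 nodes to 5, which is the cost to weigh against reduced jitter.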

Hyper-Threading

Edison includes Intel processors with Hyper-Threading Technology. When Hyper-Threading (HT) is enabled, the operating system recognizes each physical core as two logical cores. Each of the two logical cores has resources to store a program state, but they share most of their execution resources. Thus, two independent streams (i.e., processes or threads) can run simultaneously on the same physical core, but at roughly half the speed of a single stream. If a stream running on one of the logical…
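One way to see the logical/physical distinction on a Linux x86 node is to compare the number of `processor` entries in `/proc/cpuinfo` with the number of distinct (`physical id`, `core id`) pairs. The sketch below is a minimal illustration: the helper name `count_cores` and the abbreviated sample text are made up for the example, and real `/proc/cpuinfo` output contains many more fields.

```python
def count_cores(cpuinfo_text):
    """Return (logical, physical) core counts from /proc/cpuinfo-style text."""
    logical = 0
    physical = set()
    # Entries for each logical processor are separated by blank lines.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        # Logical cores sharing a physical core report the same id pair.
        physical.add((fields.get("physical id"), fields.get("core id")))
    return logical, len(physical)


# Abbreviated, made-up sample: one socket, two physical cores, HT enabled.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 1

processor : 2
physical id : 0
core id : 0

processor : 3
physical id : 0
core id : 1
"""

print(count_cores(SAMPLE))  # (4, 2)
```

On a real node the same function could be applied to the contents of `/proc/cpuinfo`; a logical count twice the physical count indicates that HT is enabled.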