
PSNAP

Description

PSNAP is the PAL System Noise Activity Program from the Performance and Architecture Laboratory at Los Alamos National Laboratory.  It consists of a spin loop that is calibrated to take a given amount of time (typically 1 ms). This loop is repeated for a number of iterations. The actual time each iteration takes is recorded.  Analysis of those times allows one to quantify operating system interference or noise.

Download

psnap-1.2 tar file, made available June 28, 2013

How to Build

P-SNAP depends on MPI, although none of its measurements depend on MPI performance. By default it uses MPI_Wtime() for microsecond timing. To use gettimeofday() instead, compile P-SNAP with the -DUSE_GETTIMEOFDAY flag.

To compile P-SNAP, edit the Makefile provided and define the C compiler and paths to MPI include files and libraries. Then type 'make'.

How to Run

Run P-SNAP like any other MPI program. To measure the noise on your system accurately, run one PSNAP process per CPU (core) on each node. A typical run uses 1,000,000 repetitions to obtain an accurate sample of the system noise. Such a run takes approximately half an hour, not counting any I/O time.

The full set of options:

Usage: psnap [OPTIONS]

-n <reps> number of repetitions
               default: 100000
-w <reps> number of warm-up repetitions
               default: 10% of the number of reps
-c <count> calibration count
               default: perform a calibration to match granularity
-g <usecs> granularity of the test in microseconds
               default: 1000
-b <count> perform a barrier every "count" loops
               default: no
-h this message

Example:
psnap -n 1000000 -w 10 > psnap.out
runs a test with 1000000 repetitions and 10 warm-up reps.

Required Runs

PSNAP is intended to run across the entire proposed system, so you should run one PSNAP MPI process per core on all cores on all nodes, fully packed.  Use 1,000,000 repetitions (-n 1000000).
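
As an illustration of a fully packed launch, the sketch below prints a Slurm srun command line for a given node count and core count. The helper name psnap_launch_cmd, the choice of Slurm as the launcher, and the ./psnap path are assumptions made for this example; substitute the launcher and paths used on your system.

```shell
#!/bin/sh
# Hypothetical helper: print a launch line for a fully packed PSNAP run.
# The node count and cores per node are site-specific inputs, not PSNAP options.
psnap_launch_cmd() {
    nodes=$1
    cores_per_node=$2
    total=$((nodes * cores_per_node))   # one MPI rank per core
    # srun's -n is the task count; psnap's -n is the repetition count.
    echo "srun -N $nodes -n $total --ntasks-per-node=$cores_per_node ./psnap -n 1000000"
}

# Example: 4 nodes with 24 cores each gives 96 MPI ranks.
psnap_launch_cmd 4 24
```

Redirect the output of the printed command to a file (for example psnap.out) so that it can be analyzed afterwards.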

The operating system used for the PSNAP run(s) must be configured as the system would be delivered to and used at NERSC for regular, production purposes.

Validation

Output format

At the beginning of the run the program calibrates the timing loop to take <granularity> microseconds. As the calibration is running, the program outputs the calibration count from each rank:

#rank= 1 count= 1007043 time= 999 difference= 1 tolerance =1
#rank= 7 count= 1007043 time= 999 difference= 1 tolerance =1
#rank= 9 count= 1006036 time= 1000 difference= 0 tolerance =1
#rank= 15 count= 1007049 time= 1000 difference= 0 tolerance =1
...

The next line reports my_count, followed by the global minimum count and the rank where it occurred (min_loc), and the global maximum count and the rank where it occurred (max_loc). The calibration uses the global maximum, i.e., the count from the rank that got the most work done and is therefore the least noisy.

my_count= 1005025 global_min= 1005025 min_loc= 0 global_max= 1007049 max_loc= 13
Using Global max for calibration

The main output from P-SNAP is, for each MPI task, a histogram of the actual time taken to run the loop. Each histogram is preceded by a line with '#c' in the first column, which gives the total time for that rank. After the '#c' marker, the columns are MPI rank, total time, and hostname.

#c 1 1002103 pal0

The histogram for this rank follows. The histogram is binned by time, and the columns are MPI rank, time (microseconds), count, and hostname:

0 1002 89 p100
0 1003 42043 p100
0 1004 17437 p100
0 1005 2036 p100
0 1006 273 p100
...
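
To make the histogram layout concrete, the following sketch computes a count-weighted mean iteration time for each rank from the four-column histogram lines. It is not part of the PSNAP distribution; the function name psnap_rank_mean is hypothetical, and the code assumes that histogram lines are exactly the non-comment lines with four fields of which the first three are integers.

```shell
#!/bin/sh
# Hypothetical helper: per-rank mean iteration time from PSNAP histogram lines.
# Input columns: MPI rank, time (microseconds), count, hostname.
psnap_rank_mean() {
    awk '
        /^#/ { next }    # skip calibration lines and "#c" summary lines
        NF == 4 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
            weighted[$1] += $2 * $3    # time multiplied by number of occurrences
            reps[$1]     += $3
            host[$1]      = $4
        }
        END {
            # Output columns: rank, mean time in microseconds, hostname.
            for (r in reps)
                printf "%d %.3f %s\n", r, weighted[r] / reps[r], host[r]
        }
    ' "$@" | sort -n
}
```

Call it with the output file as its argument, for example psnap_rank_mean psnap.out, or pipe the output into it.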

Data analysis (rough guide)

You can get a rough idea of the uniformity of the processors from the initial calibration. All processors should have a similar count. Use grep to find the "#rank" lines and examine the individual counts. In the past, this information has identified processors running at different clock rates.
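
A minimal sketch of such a check: the helper below reads the "#rank=" calibration lines and lists the ranks whose count falls more than a chosen percentage below the largest count. The function name psnap_calib_outliers and the percentage threshold are assumptions made for illustration; pick a threshold suited to your system.

```shell
#!/bin/sh
# Hypothetical helper: list ranks whose calibration count is more than
# PCT percent below the largest count seen in the "#rank=" lines.
# Usage: psnap_calib_outliers PCT [FILE...]
psnap_calib_outliers() {
    pct=$1
    shift
    awk -v pct="$pct" '
        $1 == "#rank=" && $3 == "count=" {
            count[$2] = $4 + 0
            if (count[$2] > max) max = count[$2]
        }
        END {
            # Output columns: rank, calibration count.
            for (r in count)
                if ((max - count[r]) * 100 > pct * max)
                    printf "%d %d\n", r, count[r]
        }
    ' "$@" | sort -n
}
```

For example, psnap_calib_outliers 5 psnap.out would list ranks that are more than 5% below the best count. An empty result means that all counts lie within the threshold.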

The per-rank summary lines can be combined by node name, and a percent slowdown can then be calculated from the totals. A script like the following does the aggregation:

awk '/^#c/ {sum[$NF]+=$3; tally[$NF]++} END {for (m in sum) {printf "%10d %s\n", sum[m], m}}' "$1"

A sample reduce script is included in the distribution.  Run this script with the name of the PSNAP output file as its command line argument and save its output in a file, like this:

./psnap_reduce psnap.out > psnap.out.agg

The result will be a file with two columns. The number of rows in the file should equal the number of nodes in your system. The first column contains the sum of the times to run PSNAP across the cores in that node. The second column contains the node name.

Then calculate the percent slowdown for each node, and the maximum, average, and minimum across all nodes. The slowdown is (observed time/expected time)-1, expressed as a percentage. The expected time is n*1000*reps, where n is the number of cores per node on your system, 1000 is the iteration time in microseconds at the default granularity, and reps is the number of PSNAP repetitions. When the code is run with the "-n 1000000" command line argument, reps is 1,000,000.
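
The calculation can be sketched in awk as follows. This is an illustration rather than a script from the distribution: the function name psnap_slowdown is hypothetical, and the input is assumed to be the two-column output of the aggregation step (summed time, then node name). The expected time follows the n*1000*reps formula given above.

```shell
#!/bin/sh
# Hypothetical helper: percent slowdown per node, plus max/avg/min.
# Usage: psnap_slowdown CORES_PER_NODE REPS [FILE...]
# Input columns: summed time across the cores of a node, node name.
psnap_slowdown() {
    cores=$1
    reps=$2
    shift 2
    awk -v n="$cores" -v reps="$reps" '
        NF >= 2 {
            expected = n * 1000 * reps           # expected time per node
            pct = ($1 / expected - 1) * 100      # slowdown as a percentage
            printf "%s %.4f\n", $2, pct
            sum += pct
            nodes++
            if (nodes == 1 || pct > max) max = pct
            if (nodes == 1 || pct < min) min = pct
        }
        END {
            if (nodes > 0)
                printf "max %.4f avg %.4f min %.4f\n", max, sum / nodes, min
        }
    ' "$@"
}
```

For a system with 24 cores per node and a run with 1,000,000 repetitions, the call would be psnap_slowdown 24 1000000 psnap.out.agg. The last output line carries the maximum, average, and minimum percent slowdown.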

A sample Microsoft Excel spreadsheet is provided.

Note: The following is not required for Trinity/NERSC-8.  Further analysis of the PSNAP data is also possible by plotting a histogram of the noise with iteration time on the x axis and tally on the y axis. A noise-free system has a single spike at 1000 us (or whatever the base iteration time is set to). Other spikes generally indicate kernel noise because only the kernel can schedule work at precise intervals. Mounds generally indicate noise from daemons because there's usually a Gaussian distribution of when they get scheduled and how long they run.

Finally, you can combine the histograms for each rank in an OS and plot to see the distribution of delays. 
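
One way to prepare such a combined plot is to sum the counts of all ranks for each iteration time. The sketch below does that with awk. The function name psnap_combined_hist is hypothetical, and the code assumes that histogram lines are the non-comment lines with four fields (rank, time, count, hostname).

```shell
#!/bin/sh
# Hypothetical helper: merge the per-rank histograms into one tally per time bin.
# Output columns: time (microseconds), total count across all ranks.
psnap_combined_hist() {
    awk '
        /^#/ { next }    # skip calibration lines and "#c" summary lines
        NF == 4 && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
            tally[$2] += $3
        }
        END {
            for (t in tally)
                printf "%d %d\n", t, tally[t]
        }
    ' "$@" | sort -n
}
```

The two-column result can be passed to any plotting tool, with iteration time on the x axis and tally on the y axis. To restrict the merge to a single node, filter the input on the hostname column first.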

Reporting 

Report the number of cores on which the test was run and the Benchmarked Average Deviation (as a percent). 

Return to NERSC the following:
- original PSNAP output file;
- output from the "psnap_reduce" aggregation script;
- the values of the maximum, average, and minimum percent slowdown that you obtained.

The average percent slowdown must also be entered into Table 4 in the NERSC 6 benchmarking instructions document.

Authorship

PSNAP was developed by the Performance and Architecture Laboratory, formerly of Los Alamos National Laboratory.

http://www.c3.lanl.gov/pal/