Crossroads/NERSC-9 Application Performance: Instructions and Run Rules
Application performance is a key driver for the DOE’s NNSA and Office of Science platform roadmaps. As such, application benchmarking and performance analysis will play a critical role in the evaluation of the Offeror’s proposal. The APEX application benchmarks have been carefully chosen to represent characteristics of the expected Crossroads and NERSC-9 workloads, both of which consist of solving complex scientific problems using diverse computational techniques at large scale and high levels of parallelism. The applications will be used as an integral part of the system acceptance test and as a continual measurement of performance throughout the operational lifetime of the systems.
An aggregate performance measure, Scalable System Improvement (SSI), will be used in evaluating the application performance potential of the Offeror’s proposed system.
Specific run rules for each benchmark will be included with the respective benchmark distribution, which supplies the source code, benchmark-specific requirements, and instructions for compiling, executing, verifying numerical correctness, and reporting results.
The application benchmarks, whether derived from a real application or a mini-application, are a representation of the NNSA and NERSC workloads and span a variety of algorithmic and scientific spaces. The list of application benchmarks is contained in the RFP Technical Requirements document.
Each application is a separate distribution and contains a README.APEX file describing how to build and run it, as well as any supporting library requirements. Each README contains its own instructions and run rules and must be considered a supplement to this document. If there is a discrepancy between the two, the README takes precedence. If anything is unclear, please notify APEX.
For each application, multiple problem sizes will be defined:
- The small problem is intended to be used for node level performance analysis and optimization
- A medium problem may be defined if the APEX team feels it would be of benefit to the Offeror for investigating inter-node communications at a relatively small scale
- The reference (or large) problem will be of sufficient size to represent a current production workload on current platforms
- A grand challenge problem may be defined and is meant to represent a production problem in the operational time frame of the APEX platforms.
The reference (large) problem will be used in the calculation of the SSI baseline parameters. The baseline SSI will be derived from a current-generation platform, e.g. NERSC’s Edison and/or NNSA’s Cielo. Reference times (or figures of merit) and platform specifics will be detailed in the APEX-provided spreadsheet.
APEX will define the problem sizes, together with the weights (w) and capability factors (c) used in the SSI calculation, that the Offeror must use in determining benchmark results for the proposed system. The problem used for each application in the calculation of SSI will be either the reference (large) or grand challenge problem, depending on scalability and application drivers for the respective program workloads.
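The SSI formula itself is not stated in this document. As a minimal sketch only, the example below assumes SSI is a weighted geometric mean of a per-application term c * U * S, where w and c are the APEX-supplied weight and capability factor. The utilization ratio U, the speedup S over the reference time, the function name and the dictionary keys are all hypothetical placeholders. The authoritative definition is the one in the APEX SSI documentation and spreadsheet.

```python
import math


def ssi(apps):
    """Weighted geometric mean of c * U * S (ASSUMED form, not the official definition).

    Each entry in `apps` is a dict with hypothetical keys:
      w - weight, c - capability factor,
      U - utilization ratio, S - speedup relative to the reference time.
    """
    total_weight = sum(a["w"] for a in apps)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Sum weighted logarithms, then exponentiate: a numerically stable
    # way to evaluate a weighted geometric mean.
    log_sum = sum(a["w"] * math.log(a["c"] * a["U"] * a["S"]) for a in apps)
    return math.exp(log_sum / total_weight)
```

With a geometric mean, one application with a very large speedup cannot hide a poor result on another as easily as it could with an arithmetic mean.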
For any given problem, the Offeror is allowed to decompose the problem (in a strong scaling sense) as necessary for best performance on their proposed system, subject to (1) any constraints inherent in the codes and (2) any rules pertaining to the calculation of SSI.
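To illustrate what decomposing "in a strong scaling sense" means, the sketch below splits a fixed global extent across a varying number of parts. The global size never changes; only the per-part share shrinks as parts are added. The function name and the one-dimensional block layout are assumptions for illustration. Each benchmark's own decomposition and its constraints are described in its README.APEX.

```python
def block_partition(global_n, parts):
    """Split a fixed global extent into `parts` near-equal contiguous blocks.

    Hypothetical helper: the first (global_n % parts) blocks get one extra
    element so that the block sizes always sum to global_n.
    """
    if parts <= 0:
        raise ValueError("parts must be positive")
    base, extra = divmod(global_n, parts)
    return [base + (1 if i < extra else 0) for i in range(parts)]
```

For example, `block_partition(10, 4)` gives `[3, 3, 2, 2]`, and the sizes sum to the unchanged global extent of 10 for any number of parts.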
The base set of results must use the programming method provided in the APEX-provided distribution of the respective application. The Offeror is allowed to use any version of a given programming method (e.g. the MPI and OpenMP standards) that is available and supported for the proposed system, provides the best results, and meets any other requirements specified in the RFP Technical Requirements document.
The base case is necessary to provide a point of reference relative to known systems and to ensure that any proposed system can adequately execute legacy codes. It will be used to establish baseline performance for the applications and, by comparison against the optimized case, to gauge the potential for application performance improvement. The following conditions must be met for base case results:
- The full capabilities of the code are maintained, and the underlying purpose of the benchmark is not compromised
- Any libraries and tools used for optimization, e.g. optimized BLAS libraries, compilers, special compiler switches, source preprocessors, execution profile feedback optimizers, etc., are allowed as long as they will be made available and supported as part of the delivered system
- Any libraries used must not specialize or limit the applicability of the benchmark nor violate the measurement goals of a particular benchmark
- All input parameters such as grid size, number of particles, etc., must not be changed
- All results must pass validation and correctness tests.
The optimized set of results allows the Offeror to highlight the features and benefits of the proposed system by submitting benchmarking results with optimizations beyond those of the base case. Aggressive code changes that enhance performance are permitted. The Offeror is allowed to optimize the code in a variety of ways including (but not limited to):
- An alternative programming model
- An alternative execution model
- Alternative data layouts.
The rationale and relative effect on performance of any optimization shall be fully described in the response.
The Offeror must provide results for their proposed platform for all applications defined by the RFP Technical Requirements document.
All benchmark results for the proposed system shall be recorded in a spreadsheet, which will be provided by APEX. If results are simulated, emulated and/or performance projections are used, this must be clearly indicated and all methods, tools, etc. used in arriving at the result must be specified. In addition, each surrogate system used for projections must be fully described.
The Offeror shall submit electronically the benchmark spreadsheet, benchmark source codes, compile/build scripts, output files, and documentation of any code optimizations or configuration changes as described in the run rules, preferably on a USB thumb drive, CD, or similar medium. Do not include files that cannot be easily read by a human (e.g. object files, executables, core dump files, or large binary data files). An audit trail showing any changes made to the benchmark codes must be supplied, and it must be sufficient for APEX to determine that the changes conform to the spirit of the benchmark and do not violate any specific restrictions.
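As one possible way to check a submission directory for files that are not human-readable, the sketch below flags files by suffix or by the presence of a NUL byte near the start of the file. This heuristic, the suffix list and the function names are assumptions for illustration, not APEX requirements. A file that passes this check can still be unsuitable, for example because of its size.

```python
import os

# Hypothetical suffix list; extend it to match the build environment in use.
BINARY_SUFFIXES = {".o", ".a", ".so", ".exe", ".core"}


def is_probably_binary(path, probe_bytes=8192):
    """Heuristic: known binary suffix, or a NUL byte in the first probe_bytes."""
    if os.path.splitext(path)[1].lower() in BINARY_SUFFIXES:
        return True
    with open(path, "rb") as fh:
        return b"\0" in fh.read(probe_bytes)


def find_binary_files(root):
    """Return sorted paths (relative to root) of files flagged as binary."""
    flagged = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if is_probably_binary(full):
                flagged.append(os.path.relpath(full, root))
    return sorted(flagged)
```

Calling `find_binary_files` on the staging directory before copying it to the submission medium gives a list of candidates to review and remove by hand.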
- NERSC Edison Supercomputer, http://www.nersc.gov/systems/edison-cray-xc30/.
- NNSA Cielo Supercomputer, http://www.lanl.gov/projects/cielo/.