Information and resources to help programmers get maximum performance from their applications, with an emphasis on preparing for Cori and its Intel Xeon Phi KNL processors.
We expect that many applications will need code modifications in order to run efficiently on Cori's Knights Landing manycore architecture. To run well on Cori, your application will need to have good thread scalability, take advantage of vectorization opportunities, and manage multiple levels of the memory hierarchy effectively. Read More »
Arithmetic intensity is a measure of the floating-point operations (FLOPs) performed by a given code (or code section) relative to the amount of memory traffic (Bytes) required to support those operations. It is most often defined as a FLOP per Byte ratio (F/B). This application note provides a methodology for determining arithmetic intensity using Intel's Software Development Emulator Toolkit (SDE) and VTune Amplifier (VTune) tools. A tutorial on using SDE on Edison can be found here,… Read More »
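As a rough illustration of the FLOP per Byte ratio described above, the sketch below computes arithmetic intensity for a hand-counted kernel. The function name `arithmetic_intensity` and the example kernel (`y[i] = a * x[i] + y[i]` in double precision) are assumptions made for illustration and do not come from the application note. In practice the FLOP and Byte counts would come from measurement tools rather than hand counting, and the Byte count here assumes no cache reuse and no extra write-allocate traffic.

```python
def arithmetic_intensity(flops, bytes_moved):
    """Return the FLOP per Byte ratio (F/B) for the given counts."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    return flops / bytes_moved


# Hypothetical kernel: y[i] = a * x[i] + y[i] over n double-precision elements.
n = 1_000_000
bytes_per_double = 8

# One multiply and one add per element.
flops = 2 * n

# Per element: read x[i], read y[i], write y[i] (assumes no cache reuse).
bytes_moved = 3 * bytes_per_double * n

ai = arithmetic_intensity(flops, bytes_moved)
print(f"Arithmetic intensity: {ai:.4f} FLOP/Byte")
```

Under these assumptions the ratio is 2/24, so the kernel performs far less than one FLOP per Byte moved, which suggests it would be limited by memory bandwidth rather than by compute.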
Some users have noted performance variability in the execution of their applications. There are many potential sources of variability on an HPC system, and NERSC has identified the following best practices to mitigate variability and improve application performance. This document is intended for users who are familiar with NERSC systems such as Edison. For new NERSC users.
hugepages
Use of hugepages can reduce the cost of accessing memory, especially in the case of many `MPI_Alltoall` operations. Read More »