Intel Compiler Performance on Edison
These are the Intel optimization options we compared. The quotations are from the Intel compiler online man pages.
|-fast||This "maximizes speed across the entire program". It is a very high level of optimization and includes interprocedural optimization across different source files. It increases compilation time significantly, and compiles that succeed with the other options occasionally fail with this one, probably because of its greater processor and memory requirements.|
|-fast -no-ipo||This includes all of the optimizations of the -fast option except for interprocedural optimization.|
|default||No optimization flags. By default the Intel compiler applies a high level of optimization. It is comparable to the -O2 optimization level, which "enables optimizations for speed" and is the level the online man page recommends for codes.|
|-O3||This performs all of the -O2 optimizations as well as additional, more aggressive loop transformations. It is recommended for "applications that have loops that heavily use floating-point calculations and process large data sets."|
|-O3 -unroll-aggressive -opt-prefetch||Benchmarkers recommended this combination to us as a good supplement to the -O3 optimizations.|
|-O||The same as -O2.|
On average, the optimizations associated with -fast produce faster code than the other options. Adding -no-ipo to -fast significantly reduces compilation time but still produces very well optimized code.
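A comparison like this can be scripted. The following is a minimal sketch, not the procedure used for the results here: it builds one compile command per flag set from the table and, if the compiler is available, times each compile. The driver name `icc`, the source file `kernel.c`, and the output prefix `bench` are assumptions made for illustration; substitute the compiler wrapper and sources used on your system.

```python
import shutil
import subprocess
import time

# Flag sets taken from the table above; the labels are used only for reporting.
FLAG_SETS = {
    "-fast": ["-fast"],
    "-fast -no-ipo": ["-fast", "-no-ipo"],
    "default": [],
    "-O3": ["-O3"],
    "-O3 -unroll-aggressive -opt-prefetch": [
        "-O3", "-unroll-aggressive", "-opt-prefetch",
    ],
    "-O": ["-O"],
}


def build_commands(compiler, source, out_prefix="bench"):
    """Return {label: argv} with one compile command per flag set."""
    commands = {}
    for index, (label, flags) in enumerate(FLAG_SETS.items()):
        output = f"{out_prefix}_{index}"  # separate binary for each flag set
        commands[label] = [compiler, *flags, "-o", output, source]
    return commands


def time_compiles(commands):
    """Run each compile; return {label: seconds}, with None for a failed compile."""
    timings = {}
    for label, argv in commands.items():
        start = time.perf_counter()
        result = subprocess.run(argv, capture_output=True, text=True)
        elapsed = time.perf_counter() - start
        timings[label] = elapsed if result.returncode == 0 else None
    return timings


if __name__ == "__main__":
    compiler = "icc"     # assumption: Intel C driver name; adjust as needed
    source = "kernel.c"  # hypothetical source file
    commands = build_commands(compiler, source)
    if shutil.which(compiler) is None:
        # Compiler not found: just show the commands that would be run.
        for label, argv in commands.items():
            print(f"{label:40s} {' '.join(argv)}")
    else:
        for label, seconds in time_compiles(commands).items():
            status = "failed" if seconds is None else f"{seconds:.2f} s"
            print(f"{label:40s} {status}")
```

Recording a failed compile as `None` rather than stopping keeps the comparison going when -fast fails on a code that builds with the other options. Timing the resulting binaries would need a separate step that runs each executable on a representative input.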