Table of Contents
High Performance on the J90 Systems
Philosophical Ramblings
J90 Potential
STREAM Results
STREAM Results (cont.)
Tools
Program "SLOW"
No Optimization
Moderate Optimization
High Optimization
Optimization Results
2 CPU Speedup
3 CPU Speedup
4 CPU Speedup
Useful F90 Options
Using flowtrace/flowview
Using prof
profview Output
Optimization Strategies
Scalar Optimization
Vectorization
Inhibitors to Vectorization
Nonvectorizable Code
Inlining
Pushing
Splitting
Splitting (cont.)
Scalar Recurrence
Scalar Recurrence (cont.)
Compiler Vector Directives
Parallel Computing
Parallelism
Parallelism, cont.
Data "Scoping"
Compiler Tasking Directives
Threshold Test
Helping F90 with Parallelism
Helping F90 with Directives
Helping F90 with Directives, cont.
atexpert
atexpert Output
atexpert Output, cont.
atexpert Output, cont.
Author: Thomas M. DeBoni
Email: TMDeBoni@LBL.GOV