Optimization technology: Menu
Below are some of the many transformations and optimizations that VAST can perform. Not all of them apply to every target system and compiler. Treat this as a menu of available features, each of which can be configured as needed to best serve the target system.
VAST's main job is to make the rest of the compilation system look
good by doing whatever it can at a high level to produce loops and
constructs that will be efficiently processed by the compiler.
Based on interaction with our customers, we configure these optimizations in VAST to target
the strengths of the compilation system and reinforce any weak spots.
Data Dependency Analysis
- Symbolic expression analysis.
- Symbolic assignment analysis.
- Multidimensional array/pointer/structure analysis.
- Pointer disambiguation.
- Runtime testing.
- Rich set of directives and switches.
- Assertion levels.
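As an illustration of the "Runtime testing" item above, here is a minimal sketch of the general idea: when overlap between two pointers cannot be ruled out at compile time, a cheap runtime check selects between a dependence-free fast path and a conservative path. The function names (`scale`, `scale_safe`, `scale_noalias`, `ranges_disjoint`) are hypothetical and are not taken from VAST; the code VAST actually generates may look quite different.

```c
#include <stddef.h>
#include <stdint.h>

/* Safe path: correct even if dst and src overlap in a way that carries
   a dependence from one iteration to the next. */
void scale_safe(float *dst, const float *src, size_t n, float k)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}

/* Fast path: restrict promises the arrays do not overlap, so the
   iterations are independent and the loop may be vectorized. */
void scale_noalias(float *restrict dst, const float *restrict src,
                   size_t n, float k)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}

/* Runtime test: nonzero when the two n-element ranges are disjoint.
   Assumes a flat address space, where comparing converted pointers
   is meaningful. */
int ranges_disjoint(const float *a, const float *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(float));
    return pa + bytes <= pb || pb + bytes <= pa;
}

/* Dispatcher: pick the fast path only when the test proves it is safe. */
void scale(float *dst, const float *src, size_t n, float k)
{
    if (ranges_disjoint(dst, src, n))
        scale_noalias(dst, src, n, k);
    else
        scale_safe(dst, src, n, k);
}
```

The test costs a few integer comparisons per call, which is usually small next to the loop itself.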
Nested Loop Transformations
- Loop interchange.
- Outer loop unrolling (inside of inner).
- Loop collapse (to one long loop).
- Nested loop fusion.
- Outer stripping.
- Invariant removal from inner loop.
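To make "Loop interchange" concrete, the sketch below shows the same update written with the loops in both orders. The array sizes and function names are illustrative assumptions. Because C stores arrays in row-major order, the interchanged version gives the inner loop unit stride, which is generally friendlier to caches and vector units.

```c
#define ROWS 4
#define COLS 6

/* Before: the inner loop walks down a column, so consecutive
   iterations touch elements COLS doubles apart. */
void add_column_order(double a[ROWS][COLS], double b[ROWS][COLS])
{
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            a[i][j] += b[i][j];
}

/* After interchange: the inner loop walks along a row, so consecutive
   iterations touch adjacent memory. The result is identical because
   no iteration depends on another. */
void add_row_order(double a[ROWS][COLS], double b[ROWS][COLS])
{
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            a[i][j] += b[i][j];
}
```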
Inner Loop Transformations
- Loop rerolling.
- Loop unrolling.
- Reduction unrolling.
- Loop splitting.
- Loop fusion (jamming).
- Loop peeling.
- Loop rotation (pipelining).
- Dead store elimination.
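One common reading of "Reduction unrolling" is sketched below: a sum is split across several independent accumulators so that successive additions no longer wait on one another. The unroll factor of 4 and the function names are assumptions chosen for illustration. Note that for floating-point data this reorders the additions, so rounding can differ slightly from the simple loop.

```c
#include <stddef.h>

/* Before: every addition depends on the previous one. */
double sum_simple(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum;
}

/* After: four partial sums form four independent dependence chains. */
double sum_unrolled(const double *x, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }

    double sum = (s0 + s1) + (s2 + s3);

    /* Cleanup loop for the 0 to 3 leftover elements. */
    for (; i < n; i++)
        sum += x[i];
    return sum;
}
```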
Vector SIMD Optimizations
- Alignment on the fly.
- Scalar propagation.
- Vector reduction.
- Vector load lifting.
- Complex data type.
- Runtime alignment testing.
- Runtime stride testing.
- Data width minimization.
- Mixed data size loops.
- Vectorized outer loops.
- Non-fixed iteration counts.
- Isolation/minimization of serial code.
- Exploitation of architecture-specific operations.
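The "Non-fixed iteration counts" item can be pictured with the portable sketch below. It uses no SIMD intrinsics; the fixed-trip inner loop stands in for one vector operation, and `VL` is a hypothetical vector length in elements. The main loop handles whole vectors, and a short scalar loop handles whatever is left when the trip count is not a multiple of `VL`.

```c
#include <stddef.h>

#define VL 4 /* hypothetical vector length, in elements */

void vadd(float *restrict c, const float *restrict a,
          const float *restrict b, size_t n)
{
    size_t i = 0;
    size_t main_end = n - (n % VL); /* largest multiple of VL <= n */

    /* Vector part: each pass of the outer loop models one
       VL-wide vector add. */
    for (; i < main_end; i += VL)
        for (size_t lane = 0; lane < VL; lane++)
            c[i + lane] = a[i + lane] + b[i + lane];

    /* Scalar remainder: at most VL - 1 elements. */
    for (; i < n; i++)
        c[i] = a[i] + b[i];
}
```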
Automatic Parallelization/Threading Optimizations
- Shared/private variable determination.
- Parallel region outlining (creation of parallel functions).
- Expansion of parallel regions to include non-parallel outer loops.
- Combining parallel regions to reduce overhead.
- Parallel reductions.
- Private arrays.
- Barrier removal optimization.
- OpenMP Support.
- Interprocedural analysis to support pointer disambiguation and argument evaluation.
- Cross-file inline function expansion.
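As a sketch of what shared/private determination and parallel reductions amount to, here is a dot product annotated with standard OpenMP directives, roughly the kind of annotation a parallelizer has to derive. The function name `dot` is illustrative, and this is hand-written code rather than VAST output. Compiled without OpenMP support, the pragma is ignored and the loop simply runs serially.

```c
/* x, y and n are only read, so they can be shared by all threads.
   t is declared inside the loop body, so each thread gets its own copy.
   sum is combined across threads by the reduction clause. */
double dot(const double *x, const double *y, int n)
{
    double sum = 0.0;

    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        double t = x[i] * y[i];
        sum += t;
    }
    return sum;
}
```

As with any parallel floating-point reduction, the order of the additions is not fixed, so the last bits of the result may vary between runs.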
Library Call Generation/Idiom Recognition
- Matrix multiply, matrix-vector multiply, rank one update.
- Vector Arithmetic: SDOT, DAXPY, etc.
- Table lookup, search loops.
- Index of min/max, linear recursion.
- String copies, memory copies, memory clear.
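The memory clear and memory copy idioms above can be sketched as follows: two hand-written loops, and the equivalent standard library calls that an idiom recognizer could substitute. The function names are hypothetical. Integer data is used so that an all-zero byte pattern is guaranteed to mean zero, and the `memcpy` form assumes the source and destination do not overlap.

```c
#include <stddef.h>
#include <string.h>

/* Before: explicit clear and copy loops. */
void clear_and_copy_loops(int *dst, const int *src, int *scratch, size_t n)
{
    for (size_t i = 0; i < n; i++)
        scratch[i] = 0;
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* After: the same work expressed as library calls, which are typically
   tuned for the target machine. */
void clear_and_copy_calls(int *dst, const int *src, int *scratch, size_t n)
{
    memset(scratch, 0, n * sizeof *scratch);
    memcpy(dst, src, n * sizeof *dst);
}
```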
Memory and Cache Optimizations
- Load/Store motion.
- Cache memory: block/tile/strip loops.
- Change declarations (avoid powers of 2).
- Optimize separate data memories.
- Insert cache line prefetch operations.
- Optimize use of vector registers.
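Cache blocking (tiling) can be illustrated with a matrix transpose. In the sketch below, the tile size `TILE` and the function names are assumptions; a real configuration would pick the tile size from the target's cache geometry. The tiled version visits the matrix in small square blocks so that both the rows being read and the columns being written stay cache-resident while a block is processed.

```c
#include <stddef.h>

#define TILE 4 /* hypothetical tile edge, in elements */

/* Before: writes to out stride through memory n elements at a time. */
void transpose_naive(double *out, const double *in, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            out[j * n + i] = in[i * n + j];
}

/* After: the i and j loops are each split into a loop over tiles and
   a loop within a tile. The min-style bounds handle n values that are
   not a multiple of TILE. */
void transpose_tiled(double *out, const double *in, size_t n)
{
    for (size_t ii = 0; ii < n; ii += TILE)
        for (size_t jj = 0; jj < n; jj += TILE) {
            size_t i_end = ii + TILE < n ? ii + TILE : n;
            size_t j_end = jj + TILE < n ? jj + TILE : n;
            for (size_t i = ii; i < i_end; i++)
                for (size_t j = jj; j < j_end; j++)
                    out[j * n + i] = in[i * n + j];
        }
}
```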
Conditional Optimizations
- Loop invariant IF relocation (un-switching).
- Conditional cleanup.
- Index value testing.
- Branch optimization (separating test from branch).
- Conditional move (rather than branch).
- Complex control flow.
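Loop-invariant IF relocation (un-switching) is sketched below under the usual textbook definition: a test whose outcome cannot change inside the loop is moved outside it, leaving two branch-free loops. The function names and the `use_scale` flag are made up for the example.

```c
#include <stddef.h>

/* Before: use_scale never changes inside the loop, yet it is tested
   on every iteration. */
void apply_before(float *y, const float *x, size_t n, int use_scale, float k)
{
    for (size_t i = 0; i < n; i++) {
        if (use_scale)
            y[i] = k * x[i];
        else
            y[i] = x[i];
    }
}

/* After un-switching: the test runs once, and each loop body is
   straight-line code that is easier to vectorize. */
void apply_after(float *y, const float *x, size_t n, int use_scale, float k)
{
    if (use_scale) {
        for (size_t i = 0; i < n; i++)
            y[i] = k * x[i];
    } else {
        for (size_t i = 0; i < n; i++)
            y[i] = x[i];
    }
}
```

The trade-off is code size: the loop body is duplicated once per outcome of the hoisted test.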
Loop Expression Optimizations
- Global constant propagation.
- Common subexpression re-association.
- Scalar division relocation.
- Array constant renaming.
- Index variable cleanup.
- Mixed mode optimization.
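One plausible reading of "Scalar division relocation" is sketched below: a division by a loop-invariant scalar is replaced by a single reciprocal computed before the loop and a multiply inside it. This interpretation and the function names are assumptions, not a description of VAST's exact behavior. Since `x / s` and `x * (1.0 / s)` can round differently, compilers normally apply this only when relaxed floating-point semantics are permitted.

```c
#include <stddef.h>

/* Before: one division per element. */
void normalize_before(double *x, size_t n, double s)
{
    for (size_t i = 0; i < n; i++)
        x[i] = x[i] / s;
}

/* After: one division in total, then a cheaper multiply per element. */
void normalize_after(double *x, size_t n, double s)
{
    const double r = 1.0 / s; /* hoisted out of the loop */
    for (size_t i = 0; i < n; i++)
        x[i] = x[i] * r;
}
```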