Optimizing Programs

Once created, SDFGs can be optimized by a variety of means.


The experimental automatic optimization heuristics provide a good (but likely not optimal) starting point. Be sure to install and use fast libraries, if they exist for your hardware.

Manual optimization can then provide more fine-grained control over the performance aspects of the code. To determine which optimizations work best, SDFGs can be analyzed for performance bottlenecks using Profiling and Instrumentation, or visually through static and runtime analysis. With this information in hand, the SDFG can be optimized using transformations, passes, or auto-tuning.

The most common type of optimization is the local transformation. For example, the InlineSDFG transformation inlines a nested SDFG into its parent SDFG. This transformation is applied to a single SDFG and does not require any information from other SDFGs. In contrast to local transformations, passes are applied globally to an SDFG and can be used to perform whole-program analysis and optimization, such as memory footprint reduction in TransientReuse.

Transforming a program is often an iterative sequence of operations. Some transformations do not necessarily improve the performance of an SDFG on their own, but are a “stepping stone” for other transformations. For example, applying MapTiling to a map can make InLocalStorage available on the memlets. When working with specific platforms, be sure to read the Best Practices documentation entries linked below. It is also recommended to read vendor-provided documentation on how to maximize performance on that platform.
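The “stepping stone” idea can be illustrated without any DaCe API. The following is a minimal plain-Python sketch, not the MapTiling or InLocalStorage implementation: the function names, the `tile_size` parameter, and the element-wise scaling workload are all hypothetical and chosen only for illustration. Tiling alone leaves the amount of work unchanged, but it exposes a bounded block of data that a second step can stage in a local buffer.

```python
def scale_untiled(a, factor):
    """Original 1-D "map": one iteration per element."""
    out = [0.0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] * factor
    return out


def scale_tiled(a, factor, tile_size=4):
    """Step 1 (tiling): an outer loop over tiles, an inner loop within a tile.

    The work is identical to scale_untiled; only the loop structure changes.
    """
    n = len(a)
    out = [0.0] * n
    for start in range(0, n, tile_size):
        end = min(start + tile_size, n)  # the last tile may be partial
        for i in range(start, end):
            out[i] = a[i] * factor
    return out


def scale_tiled_local(a, factor, tile_size=4):
    """Step 2 (local storage): possible only because step 1 created tiles.

    Each tile is first copied into a small local buffer, and the inner loop
    then reads from that buffer instead of the full array.
    """
    n = len(a)
    out = [0.0] * n
    for start in range(0, n, tile_size):
        end = min(start + tile_size, n)
        local = a[start:end]  # copy-in of one tile (stand-in for fast memory)
        for offset, value in enumerate(local):
            out[start + offset] = value * factor
    return out
```

All three functions compute the same result. In plain Python the second and third versions are not expected to be faster; the sketch only shows why the first rewrite has to happen before the second becomes applicable.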

Finally, our experimental auto-tuning API allows for automatic optimization of SDFGs by searching over the set of possible configurations. This is done by evaluating the performance of each configuration and selecting the best one. For example, MapPermutationTuner automatically tunes the order of multi-dimensional maps for the best performance, and DataLayoutTuner globally tunes the data layout of arrays.
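The search described above (evaluate every configuration, keep the fastest) can be sketched in plain Python. This is an illustration of the general idea under stated assumptions, not the auto-tuning API: `tune`, `sum_matrix`, the `repetitions` parameter, and the two-entry loop-order search space are all hypothetical names introduced here.

```python
import itertools
import time


def sum_matrix(matrix, order):
    """Hypothetical workload: sum a 2-D list using the given loop order."""
    rows, cols = len(matrix), len(matrix[0])
    total = 0
    if order == ("i", "j"):
        for i in range(rows):  # rows outermost
            for j in range(cols):
                total += matrix[i][j]
    else:
        for j in range(cols):  # columns outermost
            for i in range(rows):
                total += matrix[i][j]
    return total


def tune(configurations, evaluate, repetitions=3):
    """Exhaustive search: time each configuration and return the fastest.

    The best of several repetitions is used to reduce measurement noise.
    """
    best_config, best_time = None, float("inf")
    for config in configurations:
        timings = []
        for _ in range(repetitions):
            start = time.perf_counter()
            evaluate(config)
            timings.append(time.perf_counter() - start)
        elapsed = min(timings)
        if elapsed < best_time:
            best_config, best_time = config, elapsed
    return best_config, best_time


# Search space: every permutation of the two loop dimensions.
matrix = [[r * 64 + c for c in range(64)] for r in range(64)]
configurations = list(itertools.permutations(("i", "j")))
best_order, best_seconds = tune(
    configurations, lambda order: sum_matrix(matrix, order)
)
```

Which order wins depends on the machine and the interpreter, so no particular result should be expected. The sketch also shows why this kind of tuning can be costly: the number of evaluations grows with the size of the configuration space.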

The following resources are available to help you optimize your SDFG:

The following subsections provide more information on the different types of optimization methods: