Configuration entry reference
The following configuration entries are available in .dace.conf, as part of the API, and as environment variables.
See Configuring DaCe for more information on how to use the interface.
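As a minimal sketch of how a dotted entry name could relate to an environment variable, the helper below assumes the convention of a `DACE_` prefix followed by the entry path with dots replaced by underscores. The helper names `env_var_name` and `lookup` are hypothetical and are not part of DaCe; check the Configuring DaCe page for the authoritative rules.

```python
import os


def env_var_name(entry: str) -> str:
    """Map a dotted entry name to the assumed environment variable name."""
    # Assumption: "DACE_" prefix, hierarchy levels joined by underscores.
    return "DACE_" + entry.replace(".", "_")


def lookup(entry: str, default: str, environ=os.environ) -> str:
    """Return the environment override if present, otherwise the default."""
    return environ.get(env_var_name(entry), default)


print(env_var_name("compiler.cpu.args"))
```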
- cache
Compiled cache entry naming policy
Type: str
Description: Determine the name of the generated .dacecache folder:
  - name: uses the name of the SDFG directly, causing the folder to be overwritten by other programs using the same SDFG name.
  - hash: uses a mangled name based on the hash of the SDFG, such that any change to the SDFG will generate a different cache folder.
  - unique: uses a name based on the currently running Python process at code generation time, such that no caching or clashes can happen between different processes or subsequent invocations of Python.
  - single: uses a single cache folder for all SDFGs, saving space and potentially build time, but disallows executing SDFGs in parallel and caching of more than one SDFG at a time.
Default value: name
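The four naming policies can be illustrated with a small sketch. The function `cache_folder`, the folder-name formats, and the use of SHA-256 are assumptions made for illustration only; the names DaCe actually generates may differ.

```python
import hashlib
import os


def cache_folder(policy: str, sdfg_name: str, sdfg_contents: str,
                 base: str = ".dacecache") -> str:
    """Return a hypothetical cache folder path for the given policy."""
    if policy == "name":
        # Same SDFG name -> same folder, so later builds replace earlier ones.
        leaf = sdfg_name
    elif policy == "hash":
        # Any change to the contents yields a different folder.
        digest = hashlib.sha256(sdfg_contents.encode("utf-8")).hexdigest()
        leaf = f"{sdfg_name}_{digest[:16]}"
    elif policy == "unique":
        # Tied to the running process, so nothing is shared across processes.
        leaf = f"{sdfg_name}_{os.getpid()}"
    elif policy == "single":
        # One folder shared by every SDFG.
        leaf = "single_cache"
    else:
        raise ValueError(f"unknown cache policy: {policy}")
    return os.path.join(base, leaf)
```

With `hash`, two versions of the same program end up in separate folders, whereas `name` reuses one folder for both.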
- call_hooks
Hooks before/after every DaCe program call
Type: str
Description: A comma-separated list of functions (or Context Manager classes) that will be called before every DaCe program (SDFG) is compiled and run. Used for functionality such as automatic tuning or instrumentation.
Default value: (Empty)
- compiled_sdfg_call_hooks
Hooks before/after every compiled SDFG call
Type: str
Description: A comma-separated list of functions (or Context Manager classes) that will be called before every compiled SDFG’s generated code is invoked. Used for functionality such as low-level profiling.
Default value: (Empty)
- debugprint
Debug printing
Type: bool
Description: Enable verbose printouts.
Default value: False
- default_build_folder
Default SDFG build folder
Type: str
Description: Default folder in which compiled DaCe programs and SDFGs are stored. Can either be a relative path (by default) or absolute.
Default value: .dacecache
- external_transformations_path
External transformations path
Type: str
Description: Path to a directory containing external transformations that are not included in the main DaCe package. This path is added to the Python path and can be used to import custom transformation modules.
Default value: $HOME/dace_transformations/external_transformations
Default value (on Windows): %USERPROFILE%\\dace_transformations\\external_transformations
- profiling
Profiling
Type: bool
Description: Enable profiling support.
Default value: False
- profiling_status
Status bar for profiling
Type: bool
Description: Enable the tqdm status bar while profiling. If tqdm is not installed, a warning will appear. To disable this feature (and the warning), set this option to false.
Default value: True
- progress
Progress reports
Type: bool
Description: Enable progress report printouts.
Default value: True
- store_history
Store SDFG transformation history
Type: bool
Description: Store the history of transformations in the SDFG file.
Default value: True
- treps
Profiling Repetitions
Type: int
Description: Number of times to run the program for profiling.
Default value: 100
compiler
Preferences of the compiler
- compiler.allow_shadowing
Allow variable shadowing
Type: bool
Description: Allow shadowing of variables in the code (reduces exceptions to warnings when shadowing is encountered).
Default value: True
- compiler.allow_view_arguments
Allow numpy views as arguments
Type: bool
Description: If true, allows users to call DaCe programs with NumPy views (for example, “A[:, 1]” or “w.T”). As this can create pointer aliasing issues with two arrays pointing to the same memory, or analyzability issues with strides and alignment, this option is disabled by default.
Default value: False
- compiler.build_folder_mode
Save mode for the build folder
Type: str
Description: Selects which content should be saved in the build folder. Two modes are currently supported: development, which includes everything; and production, which saves only the compiled library and the folder mode file.
Default value: development
- compiler.build_type
Build configuration
Type: str
Description: Configuration type for CMake build (can be Debug, Release, RelWithDebInfo, or MinSizeRel).
Default value: RelWithDebInfo
- compiler.codegen_lineinfo
Annotate code generator lines
Type: bool
Description: Keep a source mapping between generated code and the file/line of the code generator that generated it. Used for debugging code generation.
Default value: False
- compiler.codegen_state_struct_suffix
Suffix used by the code generator to mangle the state struct.
Type: str
Description: For every SDFG the code generator is processing, a state struct is generated. The typename of this struct is derived by appending this value to the SDFG’s name. Note that the suffix may only contain letters, digits and underscores.
Default value: _state_t
- compiler.cpp_standard
C++ standard version
Type: str
Description: C++ standard to use for compilation (e.g., 14, 17, 20, 23, 26).
Default value: 20
- compiler.default_data_types
Default data types
Type: str
Description: Specify the default data types to use in generating code. If set to “Python”, Python’s semantics will be followed (i.e., float and int are represented using 64 bits). If set to “C”, C’s semantics will be used (float and int are represented using 32 bits).
Default value: Python
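The effect of the two modes on bit widths can be summarized with a short sketch. The function `default_types` and the returned type-name strings are hypothetical and only restate the widths given in the description.

```python
# Bit widths per mode, as stated in the description above.
DEFAULT_WIDTHS = {"Python": 64, "C": 32}


def default_types(mode: str) -> dict:
    """Return illustrative type names for untyped int and float values."""
    bits = DEFAULT_WIDTHS[mode]
    return {"int": f"int{bits}", "float": f"float{bits}"}


print(default_types("Python"))
print(default_types("C"))
```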
- compiler.extra_cmake_args
Additional CMake configuration arguments
Type: str
Description: If set, specifies additional arguments to the initial invocation of cmake.
Default value: (Empty)
- compiler.format_code
Format code with clang-format
Type: bool
Description: Formats the generated code with clang-format before saving the files.
Default value: False
- compiler.format_config_file
Path to the .clang-format file
Type: str
Description: Configuration file to be used by clang-format. Only used if format_code is true.
Default value: (Empty)
- compiler.indentation_spaces
Indentation width
Type: int
Description: Number of spaces used when indenting generated code.
Default value: 4
- compiler.inline_sdfgs
Inline all nested SDFGs
Type: bool
Description: If set to true, inlines all nested SDFGs upon code generation by default.
Default value: False
- compiler.library_extension
Library extension
Type: str
Description: File extension of shared libraries.
Default value: so
Default value (on Linux): so
Default value (on Windows): dll
Default value (on Darwin): dylib
- compiler.library_prefix
Library prefix
Type: str
Description: Filename prefix for shared libraries.
Default value: (Empty)
Default value (on Linux): lib
Default value (on Darwin): lib
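To show how the platform-specific defaults of compiler.library_prefix and compiler.library_extension combine, here is a minimal sketch. The function `library_filename` and the `prefix + name + "." + extension` layout are assumptions for illustration, not DaCe's actual code.

```python
import platform

# Platform defaults copied from the two entries above.
_EXTENSION = {"Linux": "so", "Windows": "dll", "Darwin": "dylib"}
_PREFIX = {"Linux": "lib", "Darwin": "lib"}


def library_filename(name: str, system: str = "") -> str:
    """Build a shared-library file name from the platform defaults."""
    system = system or platform.system()
    prefix = _PREFIX.get(system, "")          # generic default: empty
    extension = _EXTENSION.get(system, "so")  # generic default: so
    return f"{prefix}{name}.{extension}"


print(library_filename("gemm", "Linux"))
```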
- compiler.lineinfo
Add line info
Type: str
Description: Whether or not to add line info from the parsed code to the generated SDFG. Valid options are inspect and none. “inspect”: during parsing, inspect the Python call stack and automatically add line info from the parsed source code to the resulting SDFG. “none”: do not save any line info in the resulting SDFG.
Default value: inspect
- compiler.max_stack_array_size
Max stack-allocated array size (bytes)
Type: int
Description: All stack-allocated arrays (i.e., StorageType.Register) with a size larger than this value will be allocated on the heap.
Default value: 65536
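The threshold rule can be written as a one-line decision. This is a sketch only: `allocation_kind` is a hypothetical helper, and it assumes the array size is the element count multiplied by the element size in bytes.

```python
MAX_STACK_ARRAY_SIZE = 65536  # default value of this entry, in bytes


def allocation_kind(num_elements: int, bytes_per_element: int,
                    limit: int = MAX_STACK_ARRAY_SIZE) -> str:
    """Return "heap" if the array exceeds the limit, otherwise "stack"."""
    size = num_elements * bytes_per_element
    # Only sizes strictly larger than the limit move to the heap.
    return "heap" if size > limit else "stack"
```

Under this reading, an array of 8192 double-precision values (exactly 65536 bytes) stays on the stack, while one more element moves it to the heap.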
- compiler.unique_functions
Generate unique functions
Type: str
Description: Determine if and how to generate the code for equivalent NestedSDFGs:
  - “hash”: hashing is used to determine whether multiple NestedSDFGs with equivalent contents exist. If so, the code is generated only once.
  - “unique_name”: the unique_name property of the SDFG is used to determine whether two NestedSDFGs are equal, generating the code only once. This gives more control to the programmer, who can explicitly decide which NestedSDFG code can be replicated.
  - “none”: a separate function is generated for each NestedSDFG.
Default value: hash
- compiler.use_cache
Use cache
Type: bool
Description: If enabled, does not recompile code generated from SDFGs if the shared library (.so/.dll) file is present.
Default value: False
compiler.cpu
CPU compiler preferences
- compiler.cpu.args
Arguments
Type: str
Description: Compiler argument flags
Default value: -fPIC -Wall -Wextra -O3 -march=native -ffast-math -Wno-unused-parameter -Wno-unused-label
Default value (on Windows): /O2 /fp:fast /arch:AVX2 /D_USRDLL /D_WINDLL /D__restrict__=__restrict
- compiler.cpu.executable
Compiler executable override
Type: str
Description: File path or name of compiler executable
Default value: (Empty)
- compiler.cpu.libs
Additional libraries
Type: str
Description: Additional linked libraries required by target
Default value: (Empty)
- compiler.cpu.openmp_sections
Use OpenMP sections
Type: bool
Description: If set to true, multiple connected components will generate “#pragma omp parallel sections” code around them.
Default value: False
compiler.cuda
GPU (CUDA/HIP) compiler preferences
- compiler.cuda.allow_implicit_memlet_to_map
Allow the implicit conversion of Memlets to Maps during code generation.
Type: bool
Description: If true, the code generator will implicitly convert Memlets that cannot be represented by a native library call, such as cudaMemcpy(), into Maps that explicitly copy the data around. If this value is false, the code generator will raise an exception if such a Memlet is encountered. This allows the user to have full control over all Maps in the SDFG.
Default value: True
- compiler.cuda.args
nvcc Arguments
Type: str
Description: Compiler argument flags for CUDA
Default value: -Xcompiler -march=native --use_fast_math -Xcompiler -Wno-unused-parameter
Default value (on Windows): -O3 --use_fast_math
- compiler.cuda.backend
Compilation backend
Type: str
Description: Backend to compile for (‘auto’ for automatic detection, ‘cuda’ for NVIDIA, or ‘hip’ for AMD).
Default value: auto
- compiler.cuda.block_size_lastdim_limit
Maximum last dimension thread-block size in code generation
Type: int
Description: Maximum thread-block size in the third (last) dimension. The GPU code generator fails if a kernel specifies a larger block size in that dimension. The default value is derived from hardware limits on common GPUs.
Default value: 64
- compiler.cuda.block_size_limit
Maximum thread-block size in code generation
Type: int
Description: Maximum overall thread-block size. The GPU code generator fails if a kernel specifies a larger overall block size. The default value is derived from hardware limits on common GPUs.
Default value: 1024
- compiler.cuda.cuda_arch
Additional CUDA architectures
Type: str
Description: Additional CUDA architectures (separated by commas) to compile GPU code for, excluding the current architecture on the compiling machine.
Default value: 60
- compiler.cuda.default_block_size
Default thread-block size
Type: str
Description: Default thread-block size for GPU kernels when explicit GPU block maps are not defined. Can be set to ‘max’ to maximize occupancy.
Default value: 32,1,1
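The block-size entries above interact: a block size such as the default "32,1,1" has to respect both compiler.cuda.block_size_limit and compiler.cuda.block_size_lastdim_limit. The following sketch shows one way to check this by hand. Both helper functions are hypothetical, and the special value ‘max’ is deliberately not handled.

```python
from math import prod


def parse_block_size(text: str) -> tuple:
    """Parse a comma-separated block size such as "32,1,1"."""
    dims = tuple(int(part) for part in text.split(","))
    if len(dims) != 3:
        raise ValueError(f"expected three dimensions, got {text!r}")
    return dims


def check_block_size(dims: tuple, total_limit: int = 1024,
                     lastdim_limit: int = 64) -> None:
    """Raise ValueError if the block size exceeds either default limit."""
    if prod(dims) > total_limit:
        raise ValueError(f"block size {dims} exceeds {total_limit} threads")
    if dims[2] > lastdim_limit:
        raise ValueError(f"last dimension {dims[2]} exceeds {lastdim_limit}")


check_block_size(parse_block_size("32,1,1"))  # within both limits
```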
- compiler.cuda.dynamic_map_block_size
Thread-Block size for GPU_ThreadBlock_Dynamic
Type: str
Description: Thread-block size for maps using the GPU_ThreadBlock_Dynamic scheduler. Can be set to ‘max’ to maximize occupancy.
Default value: 128,1,1
- compiler.cuda.dynamic_map_fine_grained
Enable fine grained load balancing for GPU_ThreadBlock_Dynamic
Type: bool
Description: If true, the scheduler will dynamically redistribute the combined work of all threads in the warp equally across the warp (fine grained). Otherwise, each warp works sequentially only on its tasks (potential load imbalance).
Default value: True
- compiler.cuda.hip_arch
Additional HIP architectures
Type: str
Description: Additional HIP architectures (separated by commas) to compile GPU code for, excluding the current architecture on the compiling machine.
Default value: gfx906
- compiler.cuda.hip_args
hipcc Arguments
Type: str
Description: Compiler argument flags for HIP
Default value: -fPIC -O3 -ffast-math -Wno-unused-parameter
- compiler.cuda.libs
Additional libraries
Type: str
Description: Additional linked libraries required by target
Default value: (Empty)
- compiler.cuda.max_concurrent_streams
Concurrent execution streams
Type: int
Description: Maximum number of concurrent CUDA/HIP streams to generate. Special values: -1 uses only the default stream, 0 uses an unlimited number of concurrent streams.
Default value: 0
- compiler.cuda.mempool_release_threshold
Memory pool memory release threshold
Type: int
Description: A value that determines how large a memory allocation has to be before it is automatically released from the memory pool to the system. The default is -1, which indicates “never release”. Other values may be 0 (always release), or any byte value. For more information, see cudaMemPoolAttrReleaseThreshold in the CUDA toolkit documentation.
Default value: -1
- compiler.cuda.path
CUDA/HIP path override
Type: str
Description: Path to CUDA toolkit or ROCm/HIP root directory
Default value: (Empty)
- compiler.cuda.persistent_map_SM_fraction
Fraction of SMs to use for persistent GPU map
Type: float
Description: Sets the fraction of the number of SMs of the device that the GPU_Persistent map can use. Together with persistent_map_occupancy, this specifies the grid size of the kernel being launched. The value must satisfy 0.0 < persistent_map_SM_fraction <= 1.0. The resulting number of SMs is rounded up to the next integer. The maximum number of SMs that can be used is equal to cudaDevAttrMultiProcessorCount.
Default value: 1.0
- compiler.cuda.persistent_map_occupancy
Number of blocks to launch per SM used
Type: int
Description: Sets the number of thread blocks to be launched per SM being used. Essentially, this is a simple multiplier to persistent_map_SM_fraction. It is up to the user to check if the resulting number of thread blocks can run efficiently on the GPU.
Default value: 2
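Read together, persistent_map_SM_fraction and persistent_map_occupancy suggest a grid size of ceil(fraction × number of SMs) × occupancy. The sketch below computes that value under this reading; `persistent_grid_size` is a hypothetical helper, and the SM count is passed in by the caller rather than queried from the device.

```python
import math


def persistent_grid_size(num_sms: int, sm_fraction: float = 1.0,
                         occupancy: int = 2) -> int:
    """Estimate the number of thread blocks for a persistent GPU map."""
    if not 0.0 < sm_fraction <= 1.0:
        raise ValueError("sm_fraction must satisfy 0.0 < value <= 1.0")
    # Round up to a whole number of SMs, capped at the device's SM count.
    sms_used = min(num_sms, math.ceil(sm_fraction * num_sms))
    return sms_used * occupancy
```

For a device with 80 SMs, the defaults (fraction 1.0, occupancy 2) give 160 thread blocks under this formula.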
- compiler.cuda.syncdebug
Synchronous Debugging
Type: bool
Description: Enables Synchronous Debugging mode, where each library call is followed by full-device synchronization and error checking.
Default value: False
- compiler.cuda.thread_id_type
Thread/block index data type
Type: str
Description: Defines the data type for a thread and block index in the generated code. The type is based on the type-classes in dace.dtypes. For example, uint64 is equivalent to dace.uint64. Change this setting when large index types are needed to address memory offsets that are beyond the 32-bit range, or to reduce memory usage.
Default value: int32
compiler.linker
Linker preferences
- compiler.linker.args
Arguments
Type: str
Description: Linker argument flags
Default value: (Empty)
Default value (on Darwin): (Empty)
Default value (on Windows): (Empty)
- compiler.linker.executable
Linker executable override
Type: str
Description: File path or name of linker executable
Default value: (Empty)
compiler.mpi
MPI compiler preferences
- compiler.mpi.executable
Compiler executable override
Type: str
Description: File path or name of compiler executable
Default value: (Empty)
experimental
Experimental features
- experimental.check_race_conditions
Check race conditions
Type: bool
Description: Check for potential race conditions during validation.
Default value: False
- experimental.validate_undefs
Undefined Symbol Check
Type: bool
Description: Check for undefined symbols in memlets during SDFG validation.
Default value: False
frontend
Python frontend preferences
- frontend.avoid_wcr
Avoid using WCR for augmented assignments when possible
Type: bool
Description: Perform a map-symbol-dependency check on the write-subsets of augmented assignments that appear inside Maps to avoid using WCR when possible. This feature works correctly only when there is a single augmented assignment for each data dimension inside a Map.
Default value: False
- frontend.cache_size
Program cache size
Type: int
Description: The number of compiled programs to cache (based on argument types, closure constants, and closure array types) to avoid reparsing/compiling when calling a @dace.program or method.
Default value: 32
- frontend.check_args
Check arguments on SDFG call
Type: bool
Description: Perform an early type check on arguments passed to an SDFG when called directly (from SDFG.__call__). Another type check is performed when calling compiled SDFGs.
Default value: False
- frontend.dont_fuse_callbacks
Do not fuse callbacks
Type: bool
Description: Stricter mode of operation where callbacks into Python don’t participate in state fusion transformations.
Default value: False
- frontend.implicit_recursion_depth
Auto-parsing recursion depth
Type: int
Description: The maximum call-stack depth allowed when automatically parsing called dace functions or methods.
Default value: 64
- frontend.preprocessing_passes
Number of preprocessing passes on Python code
Type: int
Description: Number of times to run the Python preprocessing passes (e.g., constant folding) on the input code. Set to zero to disable preprocessing optimizations, or to -1 to run until the code no longer changes.
Default value: 5
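The three regimes of this setting (zero, a positive count, and -1) can be pictured as a bounded fixed-point loop. The sketch below is an illustration under that assumption, with `run_passes` and its string-to-string pass functions being hypothetical stand-ins for the real preprocessing passes.

```python
def run_passes(code: str, passes: list, max_passes: int = 5) -> str:
    """Apply all passes repeatedly, up to max_passes rounds.

    max_passes == 0 disables preprocessing; a negative value keeps
    going until a full round leaves the code unchanged.
    """
    rounds = 0
    while max_passes < 0 or rounds < max_passes:
        new_code = code
        for apply_pass in passes:
            new_code = apply_pass(new_code)
        if new_code == code:
            break  # fixed point reached, further rounds would do nothing
        code = new_code
        rounds += 1
    return code
```

This sketch also stops early when a positive limit has not yet been reached but the code has stopped changing, which is a design choice of the example rather than documented behavior.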
- frontend.raise_nested_parsing_errors
Raise nested parsing errors
Type: bool
Description: Raise all errors out of nested function parsing contexts instead of trying to create a callback implicitly.
Default value: False
- frontend.typed_callbacks_only
Only allow typed callbacks
Type: bool
Description: Stricter mode of operation where callbacks into Python must have explicit return value types in order to compile.
Default value: False
- frontend.unroll_threshold
Automatic unroll loop size threshold
Type: int
Description: Threshold for automatic loop unrolling of any generator (e.g., including range) with a compile-time size. A value of -1 (default) means not to unroll any loop automatically, a value of 0 means unrolling every loop, and a value above zero sets a size threshold beyond which a constant-sized loop will not be automatically unrolled.
Default value: -1
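The three cases in the description translate into a small decision function. This is a sketch of the rule as worded above, with `should_unroll` being a hypothetical name; it assumes a loop whose size equals the threshold is still unrolled.

```python
def should_unroll(loop_size: int, threshold: int = -1) -> bool:
    """Decide whether a loop with a compile-time size is unrolled."""
    if threshold < 0:
        return False  # -1: never unroll automatically
    if threshold == 0:
        return True   # 0: unroll every loop
    # Positive: unroll only loops that do not exceed the threshold.
    return loop_size <= threshold
```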
- frontend.verbose_errors
Show preprocessed AST on parsing errors
Type: bool
Description: Prints out the preprocessed unparsed AST in case of a parsing error.
Default value: False
instrumentation
Instrumentation preferences
- instrumentation.report_each_invocation
Save report for each invocation
Type: bool
Description: Save an instrumentation report file for each invocation of the SDFG, rather than one report that spans from SDFG initialization to finalization.
Default value: True
instrumentation.papi
PAPI configuration
- instrumentation.papi.default_counters
Default PAPI counters
Type: str
Description: Sets the default PAPI counter list, formatted as a Python list of strings.
Default value: ['PAPI_TOT_INS', 'PAPI_TOT_CYC', 'PAPI_L2_TCM', 'PAPI_L3_TCM']
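Because the value is a string that holds a Python list literal, it can be turned into an actual list with the standard library. The sketch below uses `ast.literal_eval` for this; `parse_counters` is a hypothetical helper, not the parsing routine DaCe itself uses.

```python
import ast


def parse_counters(value: str) -> list:
    """Parse a counter list written as a Python list of strings."""
    counters = ast.literal_eval(value)  # evaluates literals only, no code
    if not isinstance(counters, list) or not all(
            isinstance(counter, str) for counter in counters):
        raise ValueError("expected a Python list of strings")
    return counters


default = "['PAPI_TOT_INS', 'PAPI_TOT_CYC', 'PAPI_L2_TCM', 'PAPI_L3_TCM']"
print(parse_counters(default))
```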
- instrumentation.papi.overhead_compensation
Compensate Overhead
Type: bool
Description: Subtracts the minimum measured overhead from every measurement.
Default value: True
- instrumentation.papi.vectorization_analysis
Enable vectorization check
Type: bool
Description: Enables analysis of gcc vectorization information. Only gcc/g++ is supported.
Default value: False
library
Settings for handling the use of DaCe libraries.
library.blas
Built-in BLAS DaCe library.
- library.blas.default_implementation
Default implementation
Type: str
Description: Default implementation for BLAS library nodes.
Default value: pure
- library.blas.override
Force configured implementation
Type: bool
Description: Force the default implementation, even if an implementation has been explicitly set on a node.
Default value: False
library.lapack
Built-in LAPACK DaCe library.
- library.lapack.default_implementation
Default implementation
Type: str
Description: Default implementation for LAPACK library nodes.
Default value: OpenBLAS
- library.lapack.override
Force configured implementation
Type: bool
Description: Force the default implementation, even if an implementation has been explicitly set on a node.
Default value: False
library.linalg
Built-in NumPy linalg DaCe library.
- library.linalg.default_implementation
Default implementation
Type: str
Description: Default implementation for linalg library nodes.
Default value: OpenBLAS
- library.linalg.override
Force configured implementation
Type: bool
Description: Force the default implementation, even if an implementation has been explicitly set on a node.
Default value: False
library.pblas
Built-in PBLAS DaCe library.
- library.pblas.default_implementation
Default implementation
Type: str
Description: Default implementation for PBLAS library nodes.
Default value: MKLMPICH
- library.pblas.override
Force configured implementation
Type: bool
Description: Force the default implementation, even if an implementation has been explicitly set on a node.
Default value: False
optimizer
Preferences of the SDFG Optimizer
- optimizer.automatic_simplification
Automatic SDFG simplification
Type: bool
Description: Automatically performs SDFG simplification on programs.
Default value: True
- optimizer.autooptimize
Run auto-optimization heuristics
Type: bool
Description: Automatically runs the set of optimizing transformation heuristics on any program called via the Python frontend.
Default value: False
- optimizer.autospecialize
Auto-specialize symbols
Type: bool
Description: Automatically specialize every SDFG to the symbol values at call-time. Requires all symbols to be set.
Default value: False
- optimizer.autotile_partial_parallelism
Prefer partial parallelism over write-conflict tiling
Type: bool
Description: If true, sets the auto-optimizer to prefer extracting map parallel dimensions over tiling for atomic write-conflict resolution edges. This may be slower in case of small parallel dimensions vs. conflicted dimensions. This preference only applies to symbolic ranges or ranges over the autotile_size parameter.
Default value: True
- optimizer.autotile_size
Default tile size in auto-optimization
Type: int
Description: Sets the default tile size for the optimization heuristics.
Default value: 128
- optimizer.detect_control_flow
Detect control flow from state transitions
Type: bool
Description: Attempts to infer control flow constructs “if”, “for” and “while” from state transitions, allowing code generators to generate appropriate code.
Default value: True
- optimizer.match_exception
Treat exceptions in “can_be_applied” as errors
Type: bool
Description: When an exception is raised in a transformation “can_be_applied” function, it is re-raised if this setting is True. Otherwise, the exception is printed as a warning.
Default value: False
- optimizer.save_intermediate
Save intermediate SDFGs
Type: bool
Description: Save SDFG files after every transformation.
Default value: False
- optimizer.symbolic_positive
Treat all symbolic expressions as positive
Type: bool
Description: Every expression in which a symbolic value appears is treated as strictly positive. This is necessary for certain Range evaluations using Subgraph Fusion.
Default value: True
- optimizer.visualize_sdfv
Visualize SDFG
Type: bool
Description: Open the SDFG in a browser after every transformation.
Default value: False
testing
Unit testing settings
- testing.deserialize_exception
Treat exceptions in deserialization as errors
Type: bool
Description: When an exception is raised in a deserialization process (e.g., due to missing library node), by default a warning is issued. If this setting is True, the exception will be raised as-is.
Default value: False
- testing.serialization
Test Serialization on validation
Type: bool
Description: Before generating code, verify that a serialization/deserialization round trip produces the same SDFG.
Default value: False
- testing.serialize_all_fields
Serialize all unmodified fields in SDFG files
Type: bool
Description: If False (default), saving an SDFG keeps only the modified non-default properties. If True, saves all fields.
Default value: False