Configuring DaCe
Various aspects of DaCe can be configured. When first run, the framework creates a file called .dace.conf. The file is written in YAML format and provides useful settings that can be modified either directly or overridden on a case-by-case basis using the configuration API or environment variables.
Note
Documentation for all configuration entries is available at the Configuration entry reference.
DaCe first looks for the configuration file at the path given by the DACE_CONFIG environment variable, if it is set. Otherwise, it looks for a .dace.conf file in the current working directory and, if none is found there, in the user’s home directory. By default, if no file can be found, a new one is created in the home directory. If the home directory does not exist (e.g., in Docker containers), the file is created in the current working directory. If no configuration file can be created in any of the above paths, the default settings are used.
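The search order above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not DaCe's actual implementation: the function name find_config_path is hypothetical, and the sketch assumes that a set DACE_CONFIG variable is used as-is without checking that the file exists.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_config_path(cwd: Path, home: Optional[Path],
                     env: Mapping[str, str]) -> Optional[Path]:
    """Hypothetical helper mirroring the documented search order."""
    # 1. A path given in DACE_CONFIG is tried first (assumed to be used as-is).
    explicit = env.get("DACE_CONFIG")
    if explicit:
        return Path(explicit)

    # 2. Next, a .dace.conf file in the current working directory.
    candidate = cwd / ".dace.conf"
    if candidate.is_file():
        return candidate

    # 3. Finally, a .dace.conf file in the user's home directory, if there is one.
    if home is not None:
        candidate = home / ".dace.conf"
        if candidate.is_file():
            return candidate

    # Nothing found: per the text, a new file would be created at this point.
    return None


# Example call using the real environment of the current process.
home_dir = Path(os.path.expanduser("~"))
print(find_config_path(Path.cwd(), home_dir, os.environ))
```

A return value of None corresponds to the case in which DaCe would create a new file or fall back to the default settings.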
An example configuration file, which changes two configuration entries, looks as follows:
compiler:
  cuda:
    default_block_size: 64,8,1  # Change GPU map block size
debugprint: true  # Add more verbosity in printouts
When a program is compiled, the configuration used to build it is also saved along with the binary in the appropriate .dacecache folder. For reproducibility, the configuration file in that folder contains all configuration entries, not just the ones changed from their defaults.
Changing configuration entries via environment variables
Any configuration entry can be overridden using environment variables. To do so, create a variable that starts with DACE_ followed by the configuration entry path. Dot (.) characters should be replaced with _.
For example, setting the CPU compiler path (compiler.cpu.executable) with an environment variable can be done as follows:
$ export DACE_compiler_cpu_executable=/path/to/clang++
$ python my_program_with_clang.py
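The naming rule can also be applied programmatically, for instance when a launcher script prepares the environment for a DaCe program. The following is a minimal sketch; the helper env_var_name is hypothetical and not part of DaCe, and the compiler path is the same placeholder as above.

```python
import os


def env_var_name(entry: str) -> str:
    """Hypothetical helper: map an entry path to its environment variable name."""
    # Prefix with DACE_ and replace dots with underscores, as described above.
    return "DACE_" + entry.replace(".", "_")


name = env_var_name("compiler.cpu.executable")
print(name)  # DACE_compiler_cpu_executable

# Build an environment for a child process without changing the current one.
child_env = dict(os.environ)
child_env[name] = "/path/to/clang++"
```

The resulting child_env dictionary could then be passed to subprocess.run(..., env=child_env) when launching the program.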
Getting/setting configuration entries via the API
Within DaCe, obtaining or modifying configuration entries is performed by accessing the dace.config.Config singleton.
Get and set values with get() and set(). For boolean values, use get_bool() to convert more options (e.g., 1, True, yes) to booleans. If the setting is in a hierarchy, pass each level as a separate argument. Examples include:
from dace.config import Config
print('Synchronous debugging enabled:', Config.get_bool('compiler', 'cuda', 'syncdebug'))
Config.set('frontend', 'unroll_threshold', value=11)
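To illustrate why a dedicated boolean getter is useful: a value that comes from an environment variable is a string, so a plain truthiness check would treat "0" or "no" as true. The sketch below shows the kind of conversion involved. It is not DaCe's implementation; to_bool is a hypothetical name, and the accepted spellings are an assumption based only on the examples given above (1, True, yes).

```python
def to_bool(value) -> bool:
    """Hypothetical conversion in the spirit of get_bool()."""
    # Real booleans pass through unchanged (checked first, since bool is an int).
    if isinstance(value, bool):
        return value
    # Numbers are true when nonzero.
    if isinstance(value, (int, float)):
        return value != 0
    # Strings are compared case-insensitively against an assumed set of spellings.
    if isinstance(value, str):
        return value.strip().lower() in ("1", "true", "yes")
    return bool(value)


for raw in (True, 1, "yes", "True", "0", "no"):
    print(repr(raw), "->", to_bool(raw))
```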
We also provide a context manager API to temporarily change the value of a configuration entry. This is useful, for example, in unit tests, where configuration changes must not persist outside of a test:
import dace

# Temporarily enable profiling for one call
# (dace_laplace, A, and args are defined elsewhere in the program)
with dace.config.set_temporary('profiling', value=True):
    dace_laplace(A, args.iterations)
Deciding the value of a configuration entry
If an entry is defined in multiple places, the priority order for determining the value is as follows:
1. If a DACE_* environment variable is found, its value will always be used.
2. Otherwise, the API (set(), set_temporary()) is used.
3. Value located in a .dace.conf file in the current working directory.
4. Lastly, the value will be searched in .dace.conf located in the user’s home directory or the path pointed to by the DACE_CONFIG environment variable.
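The priority order can be pictured as a lookup through layered sources. The following self-contained sketch models it with plain dictionaries; resolve and its parameters are hypothetical names for illustration and do not reflect how DaCe stores its configuration internally.

```python
from typing import Any, Mapping, Sequence

_MISSING = object()  # sentinel to distinguish "not set" from a value of None


def _lookup(tree: Any, path: Sequence[str]) -> Any:
    """Walk a nested mapping along path; return _MISSING if any level is absent."""
    node = tree
    for key in path:
        if not isinstance(node, Mapping) or key not in node:
            return _MISSING
        node = node[key]
    return node


def resolve(path: Sequence[str], env: Mapping[str, str], api_overrides: Mapping,
            cwd_conf: Mapping, home_conf: Mapping, default: Any = None) -> Any:
    """Hypothetical resolver following the documented priority order."""
    # 1. A DACE_* environment variable always wins.
    env_name = "DACE_" + "_".join(path)
    if env_name in env:
        return env[env_name]
    # 2-4. Then values set through the API, the file in the working directory,
    #      and the file in the home directory (or DACE_CONFIG), in that order.
    for layer in (api_overrides, cwd_conf, home_conf):
        value = _lookup(layer, path)
        if value is not _MISSING:
            return value
    return default


cwd_conf = {"compiler": {"cuda": {"default_block_size": "64,8,1"}}}
home_conf = {"debugprint": True}
key = ("compiler", "cuda", "default_block_size")
print(resolve(key, {}, {}, cwd_conf, home_conf))  # 64,8,1
```

Note that a value taken from the environment is a string in this sketch; any conversion to another type is left out.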
Useful configuration entries
General configuration:
debugprint: Print debugging information. If set to "verbose", prints more debugging information.
compiler.use_cache: Uses DaCe program cache instead of recompiling programs. Also useful for debugging code generation (see Debugging Code Generation).
compiler.default_data_types: Chooses default types for integer and floating-point values. If Python is chosen, int and float are both 64-bit wide. If C is chosen, int and float are 32-bit wide.
optimizer.automatic_simplification: If False, skips automatic simplification in the Python frontend (see Simplify Pipeline for more information).
Profiling:
profiling: Enables profiling measurement of the DaCe program runtime in milliseconds. Produces a log file and prints out median runtime. See Profiling and Instrumentation for more information.
treps: Number of repetitions to run when profiling is enabled.
GPU programming and debugging:
compiler.cuda.backend: Chooses the GPU backend to use (can be cuda for NVIDIA GPUs or hip for AMD GPUs).
compiler.cuda.syncdebug (default: False): If True, calls device-synchronization after every GPU kernel and checks for errors. Good for checking crashes or invalid memory accesses.
FPGA programming:
compiler.fpga.vendor: Can be xilinx for Xilinx FPGAs, or intel_fpga for Intel FPGAs.