`) and can be written to at any given point.
This means that Scalars cannot be used in symbolic expressions, as their value may change, or not be accessible altogether.
In contrast, symbols are always accessible on each device (the code generator ensures this).
.. raw:: html
In general, using a symbol is always preferable if: (a) its final value is not needed outside the SDFG; (b) its value is
necessary to e.g., specify a memlet index; and (c) it is not written to by a computation. The last condition can be
worked around if the Scalar is on the host memory using a state transition (see figure on the right, the assignment
takes the value of ``scal`` and assigns it to ``sym``, which can be used in subsequent states).
During :ref:`simplification `, the :class:`~dace.transformation.passes.scalar_to_symbol.ScalarToSymbolPromotion`
pass tries to convert Scalars to symbols, if they fulfill all the constraints of a symbol.
Symbols that are defined outside a state can also be given to an SDFG as parameters. This is used when data containers
have symbolic sizes. We say that a symbol that does not have a defined value is a *free symbol*. The free symbols of an
SDFG have to be added to the symbol store (``sdfg.symbols``) using :func:`~dace.sdfg.sdfg.SDFG.add_symbol`.
DaCe also provides a special kind of symbol called :class:`~dace.symbolic.UndefinedSymbol` that represents values which are
undefined and deferred to runtime. This allows symbolic analysis to continue even when some data container dimensions
or other symbolic values cannot be determined at analysis time.
Similar to NaN behavior in floating-point arithmetic, any operation involving an :class:`~dace.symbolic.UndefinedSymbol`
results in another :class:`~dace.symbolic.UndefinedSymbol`. Comparisons with undefined symbols yield indeterminate
results. During code generation, an informative exception is raised if an :class:`~dace.symbolic.UndefinedSymbol` is
encountered, providing clear error messages that point to the undefined symbol.
The Python frontend provides syntactic sugar for creating undefined symbols by using the string ``"?"`` in array
dimensions:
.. code-block:: python
import dace
import numpy as np
@dace.program
def example_with_undefined(A: dace.float64["?"], B: dace.float64[20]):
# Only access the first 10 elements of A, so the undefined dimension isn't used
B[:10] = A[:10]
# This compiles successfully since the undefined dimension is not accessed
a = np.random.rand(100) # Actual size doesn't matter
b = np.random.rand(20)
example_with_undefined(a, b)
.. _connectors:
Connectors
~~~~~~~~~~
.. figure:: images/connectors.svg
:figwidth: 40%
:width: 100%
:align: right
:alt: Connector types
Connector types.
As SDFG states are acyclic multigraphs (where two nodes can have more than one edge between them), every edge needs a
port on the source/destination nodes to connect with. In SDFGs, we use *connectors* for this purpose. There are two types
of connectors: **view** (colored in cyan) and **passthrough** (colored in transparent turquoise). The former is used to specify
an endpoint on nodes, upon which the connector name can be used within that node (e.g., tasklet, nested SDFG, map entry
for dynamic map ranges). The latter passes through a scope node (such as a map) and allows DaCe to track the path of
memlets through that scope. An example of both is shown on the right.
A view connector does not need to define a data container. This is because connectors are references that take on the shape
of the memlet connected to it. However, connectors can have types of their own. By default, the type of a connector is
``None``, which means its type is inferred automatically by DaCe in :func:`~dace.sdfg.infer_types.infer_connector_types`.
If an type is defined, it acts as a "cast" of the data it is referring to. This is used, among other places, in SIMD
vectorization. For example, an access :pycode:`A[4*i : 4*i + 4]` connected to a connector of type :pycode:`dace.vector(dace.float64, 4)`
will reinterpret the data as a 4-element vector.
View connectors attached to tasklets are collapsed to the underlying array view, similarly to how NumPy behaves. That
means that if a memlet pointing to a tasklet is :pycode:`A[i, j, k, l]` and the connector is called :pycode:`a`, :pycode:`a`
would be accessed as a scalar. If the memlet has non-scalar ranges (whether they end up being size 1 or larger), e.g.,
:pycode:`A[i, j:j+k, 0, 0:L]`, the connector in the tasklet must be accessed as a two-dimensional array. In the example,
:pycode:`a[m, n]` in the tasklet would correspond to accessing :pycode:`A[i, j+m, 0, n]`.
Passthrough connectors are identified only by name: the incoming connector must start with ``IN_`` and outgoing connector
must start with ``OUT_``. Passthrough connectors with matching suffixes (e.g., ``IN_arr`` and ``OUT_arr``) are considered
part of the same *memlet path* (highlighted in orange in the above figure, see :ref:`below ` for more details).
Connectors cannot be dangling (without any connecting edge), and view connectors must only have one connected edge. Other
cases will fail validation, with dangling connectors marked in red upon display.
.. _sdfg-memlet:
Memlets
~~~~~~~
Memlets represent data movement in SDFGs. They can connect access nodes with other access nodes (creating a
read *and* a write operation), access nodes with :ref:`view connectors ` (creating a read *or* a write
operation, depending on the direction), or directly connect a tasklet to a tasklet or a view connector (creating both
a read and a write). Several fields describe the data being moved:
* ``data``: The data container being accessed.
* ``subset``: A :class:`~dace.subsets.Subset` object that represents the part of the container potentially being moved.
* ``volume``: The number of elements moved (as a symbolic expression). The ``dynamic`` boolean property complements
this --- if set to True, we say the number of elements moved is *up to* the value of ``volume``.
* If ``volume`` is set to ``-1`` and ``dynamic`` is True, the memlet is defined as unbounded, which means there is
no way to analyze how much (and when) data will move.
* ``wcr`` (default None): An optional lambda function that specifies the Write-Conflict Resolution function.
If not None, every movement into the destination container will instead become an update (see below). The first argument
of the lambda function is the old value, whereas the second is the incoming value.
* ``other_subset``: Typically, only ``subset`` is necessary for memlets that connect access nodes with view connectors.
For other cases (for example, access node to access node), the other subset can be used to offset the
sub-region of the container *not* named in ``data``.
* For example, the copy :pycode:`B[1:21, 0:3] = A[i:i+20, j-1:j+2]` can be represented by
:pycode:`dace.Memlet(data='A', subset='i:i+20, j-1:j+2', other_subset='1:21, 0:3')`.
*For performance reasons, always prefer constructing range subsets from* :class:`~dace.subsets.Range` *over a string.*
* The alias properties ``src_subset`` and ``dst_subset`` specify the source and destination subsets, regardless of the
value of ``data``.
There are more properties you can set, see :class:`~dace.memlet.Memlet` for a full list.
Memlet subsets and volumes are used for analyzing (or estimating, if dynamic) data movement patterns and costs.
A memlet's ``subset`` does not necessarily mean that all values in that subset would be read at runtime. It rather acts as a
*constraint* on the potential accessed values. This can be used to ensure certain memory is accessible after some optimizations
Together with ``volume``, we can define important memory access patterns, such as indirect access:
.. code-block:: python
# ...
for i in dace.map[0:N]:
mileage += distances[destinations[i]]
# ...
In the above, the memlets inside the map are :pycode:`dace.Memlet(data='destinations', subset='i', volume=1)` and
:pycode:`dace.Memlet(data='distances', subset='0:20', volume=1)`. Notice that in the latter, we know *one* element would
be read in the range ``0:20``, but we do not know which one it is going to be.
.. figure:: images/memlet-scopes.svg
:name: memtree
:figwidth: 40%
:width: 100%
:align: right
:alt: Memlet paths and trees.
Memlet paths and trees.
**Memlet paths**: When using scopes, such as maps, we can have multiple memlets connected to the same connector, and
through multiple levels of scopes. In order to trace and distinguish between them, we use *memlet paths* (green highlighted
edges in the figure to the right). A memlet path is a sequence of memlets that connect between two view connectors.
We can obtain a memlet path by calling the :func:`~dace.sdfg.state.StateGraphView.memlet_path` on an SDFG state with the
edge.
The memlet path is actually part of a **memlet tree** (highlighted in orange), since in each scope a memlet
can split into multiple memlets. To obtain the tree, use :func:`~dace.sdfg.state.StateGraphView.memlet_tree`. Its *root*
is the highest-scope memlet (bottom or top, depending on the direction of the path), and its *leaves* are the lowest-scope
memlets (connected to the tasklet in the figure). Changing a data descriptor often requires changing the data names on every
memlet in the memlet tree.
In code generation, memlets emit reference/copy code at the memlet tree leaf level (i.e., in the innermost scope),
because of the parametric replication aspect (in the "expanded" graph, every memlet path is a single edge connecting to
a copy of the internal node).
**Empty memlets** are memlets that carry no data. They are used to lock a node into a scope, without actually moving
data into or out of it. For example, when zeroing an array, the tasklet ``a = 0`` has no inputs, but has to be inside a map.
It is thus connected to the map entry node with an empty memlet. You can check if a memlet is empty using the :func:`~dace.memlet.Memlet.is_empty`
method.
In **write-conflict resolution (WCR) memlets**, the WCR function itself may have properties (such as commutativity).
The WCR expression is symbolically analyzed (using SymPy) to determine such properties, and even finds common reduction
functions (such as summation or product), replacing it with a built-in operator (e.g., ``a + b`` becomes
:class:`dace.dtypes.ReductionType.Sum`). If it can detect a :class:`~dace.dtypes.ReductionType`, it knows more properties
about the function and can use fast version in libraries (for example, ``MPI_SUM`` in MPI). Note that Streams cannot use
WCR.
WCR updates can be implemented in a platform-dependent way, for example using atomic operations, one-sided communication,
or different kinds of accumulators on FPGAs. It is thus important to differentiate updates from a simple read+write, since
it can yield faster, more analyzable code.
Like copies, WCR memlets are applied on the innermost scope of the memlet tree. This requires to exercise caution when
using such memlets, as they can lead to excessive atomic operations or critical sections. There are, however, transformations
that can be applied on the SDFG to change the schedule and, for example, add a buffer to accumulate locally and save on
expensive update operations (see :class:`~dace.transformation.dataflow.stream_transient.AccumulateTransient`).
.. _sdfg-map:
Parametric Parallelism
~~~~~~~~~~~~~~~~~~~~~~
DaCe supports parametric parallelism, which is a form of parallelism that is defined by symbolic expressions.
To represent parallel sections, we use graph scopes, which are defined by *Entry* and *Exit* nodes (as shown in
:numref:`scopefig`). The parametric ranges themselves are represented by symbolic integer sets (using the
same :class:`~dace.subsets.Subset` class used for memlets). There are several scopes provided in DaCe, the most common one
being :class:`~dace.sdfg.nodes.Map` and the second is :class:`~dace.sdfg.nodes.Consume`.
**Maps** represent a simple form of replication, where the scope is replicated for each value in the range. For example,
``dace.map[0:N, 1:M-N+1]`` is a parametrically-parallel scope with two dimensions, the first ranging from ``0`` to ``N-1``,
and the second ranging from ``1`` to ``M-N`` (inclusive). This means that the scope subgraph will be replicated for this
number of times. The ``params`` property of the map specifies the names that parameterize the map range, and must match
the given ``range`` property.
Maps can be *scheduled* to have their contents executed in different locations. For example, a map can be
scheduled to run on a multi-core CPU (with :class:`~dace.dtypes.ScheduleType.CPU_Multicore`), which will generate a
``for`` loop for each instance of the subgraph, using OpenMP to control parallelism. Maps can also be scheduled to run
on a GPU (with :class:`~dace.dtypes.ScheduleType.GPU_Device`), or other devices, see :class:`~dace.dtypes.ScheduleType`
for all built-in types. The default schedule is :class:`~dace.dtypes.ScheduleType.Default`, which means that the
scheduler will decide the schedule for the map based on its context. For example, if the map is already situated inside
another multi-core map, the map will be scheduled to a single core using :class:`~dace.dtypes.ScheduleType.Sequential`.
Maps have other properties that further control their behavior. For example, their exact schedule
(unrolling, loop collapsing) can be controlled with certain attributes, such as ``unroll`` and ``gpu_block_size``.
**Consume** scopes are similar to maps, but are used to consume data from a stream. Their processing elements (``num_pes``)
is a symbolic expression that governs the degree of parallelism during the scope. ``pe_index`` acts as a "thread index"
variable. The quiescence ``condition`` property can be used to control when to end the scope's execution (by default,
if ``condition`` is not given, the scope will run until the stream is empty). The ``chunksize`` property specifies
the granularity of the stream consumption (how many elements to consume at once).
.. figure:: images/scope-tree.svg
:figwidth: 30%
:width: 100%
:align: right
:alt: Scope trees.
SDFG scopes in :numref:`memtree`.
**Navigating in scopes**: Like memlets, scopes in an SDFG state form a tree. The root of the tree is the top-level scope,
i.e., the one that is not contained in any other scope. You can navigate the scope tree as a whole using the
:func:`~dace.sdfg.state.StateGraphView.scope_tree` method. You can also navigate the scope tree of a specific node
by using the :func:`~dace.sdfg.state.StateGraphView.scope_children` method to get the child nodes of the specified
scope, or :func:`~dace.sdfg.state.StateGraphView.scope_dict` to get the parent-pointing dictionary of each node to its
parent. This is visualized in the figure on the right.
To easily get the scope of a node, you can use the :func:`~dace.sdfg.state.StateGraphView.scope_subgraph` method,
which will return the subgraph of only the nodes contained in the requested scope. To jump to the entry node of the current
scope, use the :func:`~dace.sdfg.state.StateGraphView.entry_node` method. Similarly, to get the exit node of the
current entry node, use the :func:`~dace.sdfg.state.StateGraphView.exit_node` method.
**Dynamic Map Ranges**: Such ranges can use memlets to define the map ranges directly from data containers, while
still retaining the dataflow of a single state. As they are fed into a view connector on the map entry node, their value
(described by the connector name) can be used in the symbolic expressions of the map range, and anywhere inside the map
scope as a symbol (same as the iteration variables). Only scalar connectors are allowed.
In the following example, we use dynamic map ranges to compute a sparse matrix-vector multiplication,
where the vector is dense. Every output row has a defined range (standard, symbolic map), whereas the corresponding rows
of the input sparse matrix have a dynamic length (dynamic-range.)
.. raw:: html
.. code-block:: python
@dace.program
def spmv(A_row: dace.uint32[H + 1], A_col: dace.uint32[nnz],
A_val: dace.float32[nnz], x: dace.float32[W],
b: dace.float32[H]):
for i in dace.map[0:H]:
for j in dace.map[A_row[i]:A_row[i + 1]]:
with dace.tasklet: # Explicit dataflow syntax
aval << A_val[j]
xval << x[A_col[j]]
out = aval * xval
out >> b(1, lambda a,b: a+b)[i] # WCR output
spmv.to_sdfg().view()
.. _viewref-lang:
Views and References
~~~~~~~~~~~~~~~~~~~~
Similarly to a NumPy view, a *View* data container is created whenever an array is sliced, reshaped, reinterpreted,
or by other NumPy functions (e.g., with a ``keepdims`` argument).
A view never creates a copy, but simply reinterprets the properties (shape, strides, type, etc.) of the viewed container.
In C it would be equivalent to setting a pointer to another pointer with an offset (that is also how the code generator
implements it). Views are useful for performance, reducing memory footprint, and to modify sub-arrays in meaningful ways,
for example solving a system of equations on matrices made from a long 1-dimensional vector.
.. figure:: images/views.svg
:figwidth: 50%
:align: right
:alt: Chained views example.
Chained views and the corresponding generated code. A column of ``A`` is turned into a matrix, which is
viewed again at a sub-column. Each data descriptor must set the appropriate strides, as the viewed
memlet is only used to offset a pointer.
A view is always connected to another data container from one side, and on the other it is used just like a normal access node.
In case of ambiguity, the ``views`` connector points to the data being viewed. You can even chain multiple views
together using this connector (see figure on the right).
Another place where views can be found is in Nested SDFG connectors: any data container in a nested SDFG is in fact a
*View* of the memlet going in or out of it.
As opposed to Views, *References* can be set in one place (using the ``set`` connector), and then be used as normal access nodes
later anywhere else. They can be set to different containers (even more than once), as long as the descriptors match (as
with views). References are useful for patterns such as multiple-buffering (``a, b = b, a``) and polymorphism.
.. warning::
Since references detach the access node from the data container, it inhibits the strength of data-centric analysis for
transformations and optimization. Frontends generate them only when absolutely necessary.
Therefore, **use references sparingly**, or not at all if possible.
An example use of references can be created with the following code:
.. raw:: html
.. code-block:: python
i = dace.symbol('i')
@dace
def refs(A, B, out):
if i < 5:
ref = A
else:
ref = B
out[:] = ref
refs.to_sdfg(a, b, out).view()
.. _libnodes:
Library Nodes
~~~~~~~~~~~~~
For certain functions and methods, there are high-performance or platform-specific versions that can be beneficial.
For example, the ``numpy.dot`` function can be implemented using a matrix multiplication, which in turn can be implemented
by a fast Basic Linear Algebra Subprogram (BLAS) libraries, such as MKL or CUBLAS.
DaCe provides the capability to use such libraries by using *Library Nodes*, which can be used for different purposes:
* **Specialization**: A library node can be used to specialize a generic node to a specific implementation.
* **Coarse-grained transformations**: A library node can be used to implement a transformation that is not possible
with the generic node, i.e., relying on the semantics of the node. For example, fusing a matrix multiplication with
a transposition operation.
Specializing library nodes to a specific implementation is called **expansion**. Expansions are user-extensible and can
be added to a DaCe library or to a specific SDFG. Built-in expansions and library nodes are defined in the ``dace.libraries``
module, and are automatically loaded when the library is imported.
Library-specific transformations also exist in the standard set of transformations, for example
:class:`~dace.transformation.dataflow.matrix_product_transpose.MatrixProductTranspose`.
During compilation or optimization, a library node will be expanded using the ``expand`` method of the chosen implementation.
If no implementation is chosen, the ``default_implementation`` field of the library node class will be chosen. One can
override default implementations for a library node type, or for an entire library. This can be used, for example, to set
:ref:`BLAS ` to default to a certain library:
.. code-block:: python
from dace.libraries import blas
@dace
def mult(a, b):
return a @ b
mult(a, b) # will use the default implementation in the config file ("pure")
blas.default_implementation = 'MKL'
@dace
def mult2(a, b):
return a @ b
mult2(a, b) # will use Intel MKL
Internally, an expansion is a subclass of :class:`~dace.transformation.transformation.ExpandTransformation`. It is
responsible for creating a new SDFG that implements the library node, and for connecting the inputs and outputs of the
library node to the new SDFG. An example of such an expansion is Einstein summation specialization
(`see full file `_):
.. code-block:: python
from dace import SDFG, SDFGState, library, nodes, properties
from dace import transformation as xf
# Define the library node itself
@library.node
class Einsum(nodes.LibraryNode):
# Set the default expansion of the node to 'specialize' (registered below)
implementations = { ... }
default_implementation = 'specialize'
# Configurable properties of the einsum node go here
...
...
# Define the expansion, which specializes the einsum by lowering it to either
# a BLAS operation or a direct contraction
@library.register_expansion(Einsum, 'specialize')
class SpecializeEinsum(xf.ExpandTransformation):
# Define environments necessary for this expansion (optional, can be an empty list)
environments = []
# The following method returns the SDFG that results from expanding the library node.
# Upon expansion, DaCe will insert the returned SDFG into the graph as a nested SDFG
# node (which can be inlined).
@staticmethod
def expansion(node: Einsum, parent_state: SDFGState, parent_sdfg: SDFG) -> SDFG:
# Make an SDFG for the expansion
sdfg = SDFG('einsum')
...
return sdfg
.. _memprop:
Memlet Propagation
------------------
In order to perform transformations, DaCe needs to know the shape and type of memlets at all scopes. For example,
if only a single column of a matrix is accessed in a map, this could be used to optimize memory transfers. Therefore,
DaCe performs *memlet propagation* to infer the shape and type of memlets at all scopes. It is a process that starts
from the memlets on the innermost scopes, and propagates outwards through scope entry/exit nodes.
Formally, the memlet that goes into a scope contains the union of the subsets of all the internal memlets, and its volume
is the sum of the volumes of all the internal memlets. This means that we must project the space of the parametric scope
onto the internal, parametric memlet. For example, the memlet ``A[i, j:j+10:2]`` inside a map ranged ``[i=1:N, j=0:25]``
would be propagated to ``A[1:N, 0:35:2]``, and the memlet ``B[2*i + 1]`` would be propagated to ``B[3:2*N:2]`` within
the same map.
Memlet propagation also works through nested SDFGs, by propagating the potential value ranges each symbol may take.
Such propagation is the equivalent of scope propagation for ``for`` loops and general symbolic values. The following
figure shows an example of this:
.. figure:: images/memprop.png
:align: center
:width: 70%
Memlet propagation through a nested SDFG with a loop.
DaCe uses the SymPy symbolic engine to perform this projection, as well as to union
multiple internal memlets for propagation. The projection is performed in the :mod:`~dace.sdfg.propagation` module, using
pattern matching on symbolic expressions. The :class:`~dace.sdfg.propagation.MemletPattern` class is used to define
such patterns, and the function :func:`~dace.sdfg.propagation.propagate_subset` performs the propagation itself.
The process is triggered automatically by the Python frontend. If you want to trigger it manually on an entire SDFG, call
:func:`~dace.sdfg.propagation.propagate_memlets_sdfg`. Note, however, that this may be a slow process depending on the
size of the SDFG. To propagate a single memlet, use :func:`~dace.sdfg.propagation.propagate_memlet`, and for a single scope
use :func:`~dace.sdfg.propagation.propagate_memlets_scope`. If you only want to trigger the part that propagates
symbol values across the SDFG state machine, call :func:`~dace.sdfg.propagation.propagate_states`.
.. _sdfg-api:
SDFG Builder API
----------------
When writing frontends for DaCe and when transforming SDFGs, it is necessary to interact with the SDFG directly.
The SDFG builder API provides a set of functions that allow the user to create SDFGs programmatically. The API is
similar to graph manipulation APIs, such as `NetworkX `_ (in fact, the SDFG is internally
represented as an ordered NetworkX graph).
The entire API is organized around methods of the :class:`~dace.sdfg.sdfg.SDFG` and :class:`~dace.sdfg.state.SDFGState`
classes. Additionally, helper functions in :mod:`dace.sdfg.utils` and :mod:`dace.transformation.helpers` can be used
to perform common operations, such as traversal, querying, analysis, graph nesting, and many others.
**Adding/removing elements**: The two classes contain ``add_*`` methods for adding nodes and edges, and ``remove_*`` methods for
removing them. Examples include :meth:`~dace.sdfg.state.SDFGState.add_tasklet` and :meth:`~dace.sdfg.sdfg.SDFG.add_edge`.
Other helper methods can compose these operations, such as :meth:`~dace.sdfg.state.SDFGState.add_mapped_tasklet`, which adds
a tasklet inside a map; and :meth:`~dace.sdfg.state.SDFGState.add_memlet_path` which adds a path of memlets between an
arbitrary number of scopes, :ref:`propagating ` them along the way.
**Connectors and edges**: The methods ``add_{in,out}_connector`` on subclasses of :class:`~dace.sdfg.nodes.Node` add
appropriate connectors (view or passthrough) to the node. The :meth:`~dace.sdfg.state.SDFGState.add_edge` method of
:class:`~dace.sdfg.state.SDFGState` receives two nodes and two connector names, and creates an edge between them.
:meth:`~dace.sdfg.graph.OrderedMultiDiConnectorGraph.add_nedge` (no-connector edge) acts as a shorthand for edges with
``None`` connectors on either side.
**Enumeration**: The API can enumerate the specific nodes and edges of a graph via the ``nodes()`` and ``edges()`` methods.
The interface also contains methods for querying the graph, such as :meth:`~dace.sdfg.sdfg.SDFG.all_nodes_recursive`, which is
useful to obtain all nodes, including states and nodes of nested SDFGs (similarly, :meth:`~dace.sdfg.sdfg.SDFG.all_edges_recursive`
and :meth:`~dace.sdfg.sdfg.SDFG.all_sdfgs_recursive` query edges and SDFGs, respectively). Navigating within states can
be performed using the :ref:`scope methods `.
**Subgraphs**: The API allows the user to obtain subgraphs of the SDFG, such as the :meth:`~dace.sdfg.state.StateGraphView.scope_subgraph`
method, which returns a subgraph of the state that contains all nodes within a given scope. In general, the
:class:`~dace.sdfg.graph.SubgraphView` class can be used to obtain a view of a subgraph. It filters out nodes and edges
that are not part of the subgraph, and can be used to perform operations on the subgraph, such as traversing it.
**Data descriptors**: Methods such as :meth:`~dace.sdfg.sdfg.SDFG.add_array` and :meth:`~dace.sdfg.sdfg.SDFG.add_stream`
can be used to add data descriptors to the SDFG. Any subclass of the :class:`~dace.data.Data` base class
can be added to the SDFG using the :meth:`~dace.sdfg.sdfg.SDFG.add_datadesc` method.
**Traversal**: Since nodes and edges are stored in arbitrary order, the API provides methods for traversing the graph
by topological order. The method :func:`~dace.sdfg.utils.dfs_topological_sort` returns a list of nodes in a state, and
:func:`~dace.sdfg.analysis.cfg.blockorder_topological_sort` traverses the state machine in approximate order of execution
(i.e., preserving order and entering if/for scopes before continuing).
What to Avoid
-------------
While SDFGs are Turing complete, they are not meant to be used to represent any program concisely.
Concepts such as parametric-depth recursion are control-centric (and not very portable). In order to represent them,
a stack would need to be simulated in the representation, which will be slow. The frontends avoid parametric-depth
recursion by creating a callback to the original runtime (e.g., the Python interpreter) instead of parsing it into an
SDFG.
Another example is dynamic memory. Dynamically-allocated arrays are supported, so long as their sizes can be expressed
as symbolic expressions. However, this would create another state in the state machine (see the example in :ref:`sdfg-symbol`),
which might hamper dataflow analysis. In general, it is better to use constant or symbolically-sized memory allocation.
Lastly, complex pointer arithmetic and referencing different types of arrays with the same Reference are not supported.
Frontends will try to mitigate this by creating multiple Reference data descriptors, but references in general
inhibit provenance tracking of arrays, and many optimizations would not apply.
In many of the above cases, the frontends will emit a warning to the user, and unsupported behavior will be encapsulated
as callbacks. It is important to pay attention to these warnings, as they might contribute to performance issues
later on.
.. _format:
``.sdfg`` File Format
---------------------
An SDFG file is a JSON file that contains all the properties of the graph's elements. See :ref:`properties` for more
information about how those are saved.
You can save an SDFG to a file in the SDFG API with the :func:`~dace.sdfg.sdfg.SDFG.save` method. Loading an SDFG from a
file uses the :func:`~dace.sdfg.sdfg.SDFG.from_file` static method. For example, in the following save/load roundtrip:
.. code-block:: python
@dace.program
def example(a: dace.float64[20]):
return a + 1
sdfg = example.to_sdfg() # Create an SDFG out of the DaCe program
sdfg.save('myfile.sdfg') # Save
new_sdfg = dace.SDFG.from_file('myfile.sdfg') # Reload
assert sdfg.hash_sdfg() == new_sdfg.hash_sdfg() # OK, SDFGs are the same
The ``compress`` argument can be used to save a smaller (``gzip`` compressed) file. It can keep the same extension,
but it is customary to use ``.sdfg.gz`` or ``.sdfgz`` to let others know it is compressed.
It is recommended to use this option for large SDFGs, as it not only saves space, but also speeds up loading and
editing of the SDFG in visualization tools and the VSCode extension.