Parsing Python Programs to SDFGs
This document describes DaCe’s core Python language parser, implemented by the ProgramVisitor class.
The ProgramVisitor supports a restricted subset of Python’s features that can be expressed directly as SDFG elements.
A larger subset of the Python language is supported through code preprocessing and/or in JIT mode.
Supported Python Versions
ProgramVisitor supports exactly the same Python versions as the Data-Centric framework overall: 3.7-3.10.
To add support for newer Python versions, the developer should amend the ProgramVisitor
to handle any changes to the Python AST (Abstract Syntax Tree) module appropriately. More details can be found in the
official Python documentation.
Classes and object-oriented programming are only supported in JIT mode.
Python native containers (tuples, lists, sets, and dictionaries) are not supported directly as
Data. Specific instances of them may be indirectly supported through code preprocessing. There is also limited support for specific uses, e.g., as arguments to some methods.
Recursion is not supported.
Using NumPy arrays with negative indices (at runtime) to wrap around the array is not allowed. Compile-time negative values (such as -1) are supported.
The ProgramVisitor Class
ProgramVisitor traverses a Data-Centric Python program’s AST and constructs the corresponding SDFG.
ProgramVisitor inherits from Python’s ast.NodeVisitor
class and, therefore, follows the visitor design pattern. Developers are encouraged to familiarize themselves with this
programming pattern (for example, see Wikipedia and Wikibooks); the basic functionality is nevertheless described below.
An object of the ProgramVisitor class is responsible for a single SDFG object.
While traversing the Python program’s AST, if the need for a NestedSDFG arises (see Nested ProgramVisitors), a new
ProgramVisitor object will be created to handle the corresponding Python Abstract Syntax sub-Tree.
The ProgramVisitor has the following attributes:
filename: The name of the file containing the Data-Centric Python program.
src_line: The line (in the file) where the Data-Centric Python program is called.
src_col: The column (in the line) where the Data-Centric Python program is called.
orig_name: The name of the Data-Centric Python program.
globals: The variables defined in the global scope. Typically, these are modules imported and global variables defined in the file containing the Data-Centric Python program.
closure: The closure of the Data-Centric Python program.
nested: True if generating a NestedSDFG.
simplify: True if simplify() should be called on the generated SDFG object.
scope_vars: The variables defined in the parent
variables: The variables defined in the current
views: A dictionary of Views and the Data subsets viewed. Used to generate Views for Array slices.
nested_closure_arrays: The closure of nested Data-Centric Python programs.
annotated_types: A dictionary from Python variables to Data-Centric datatypes. Used when variables are explicitly type-annotated in the Python code.
sdfg: The generated SDFG object.
current_lineinfo: The current DebugInfo. Used for debugging.
modules: The modules imported in the file of the top-level Data-Centric Python program. Produced by filtering globals.
loop_idx: The current scope-depth in a nested loop construct.
symbols: The loop symbols defined in the SDFG object. Useful for memlet/state propagation when multiple loops use the same iteration variable but with different ranges.
indirections: A dictionary from Python code indirection expressions to Data-Centric symbols.
The ProgramVisitor and the Visitor Design Pattern
The ProgramVisitor takes as input a Data-Centric Python program’s AST (an ast.FunctionDef object).
It then iterates over and visits the statements in the program’s body. The Python call tree when visiting a statement is approximately as follows:
In the fourth call above, Class in visit_Class is a placeholder for the name
of one of the Python AST module classes supported by the ProgramVisitor.
For example, if the statement is an object of the ast.Assign class, the
visit_Assign() method will be invoked.
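The dispatch on the AST class name can be illustrated with Python’s standard ast.NodeVisitor alone. The following is a minimal sketch and not DaCe’s implementation: ToyVisitor and the sample function prog are hypothetical names introduced only for illustration.

```python
import ast


class ToyVisitor(ast.NodeVisitor):
    """Hypothetical visitor that records which visit_* methods are invoked."""

    def __init__(self):
        self.calls = []

    def visit_Assign(self, node):
        # ast.NodeVisitor.visit() dispatches here for every ast.Assign node.
        self.calls.append("visit_Assign")
        self.generic_visit(node)  # recurse into the child AST nodes

    def visit_For(self, node):
        # Dispatched for every ast.For node; its children include the loop body.
        self.calls.append("visit_For")
        self.generic_visit(node)


source = """
def prog(A):
    a = 5
    for i in range(3):
        b = i
"""

func = ast.parse(source).body[0]  # the ast.FunctionDef of the sample program
visitor = ToyVisitor()
for stmt in func.body:  # visit each statement in the function body
    visitor.visit(stmt)

print(visitor.calls)  # ['visit_Assign', 'visit_For', 'visit_Assign']
```

The second visit_Assign entry comes from the assignment inside the loop body, which is reached through generic_visit() rather than through the top-level iteration.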
Each object of a Python AST module class (called henceforth AST node) typically
has as attributes other AST nodes, generating tree-structures. Accordingly, the
corresponding ProgramVisitor methods perform some action for the parent AST node
and then recursively call other methods to handle the child AST nodes until
the whole tree has been processed. It should be mentioned that, apart from the
class-specific visitor methods, the following may also appear in the Python call tree:
Nested ProgramVisitors and NestedSDFGs
A NestedSDFG is generated when parsing a call (see ast.Call) to another Data-Centric Python program or an
SDFG object. It should be noted that calls to, e.g., supported NumPy methods (see
replacements), may also (eventually) trigger the generation of a
NestedSDFG. However, this mostly occurs through Library Nodes.
Below is a list of all AST class-specific ProgramVisitor methods, with a short description
of which Python language features they support and how:
Parses functions decorated with one of the following decorators:
The Data-Centric Python frontend does not allow definition of Data-Centric Python programs inside another one.
This visitor will catch such cases and raise an exception.
Parses for statements using one of the following iterators:
range: Results in a (sequential) for-loop.
parrange(): Results in a uni-dimensional Map.
import dace

@dace.program
def for_loop(A: dace.int32[10]):
    for i in range(0, 10, 2):
        A[i] = i
Parses while statements. Example:
import dace

@dace.program
def while_loop():
    i = 10
    while i > 0:
        i -= 3
Parses break statements. In the following example, the for-loop behaves as an if-else statement. This is also evident from the generated dataflow:
import dace

@dace.program
def for_break_loop(A: dace.int32[10]):
    for i in range(0, 10, 2):
        A[i] = i
        break
Parses continue statements. In the following example, the use of continue makes the A[i] = i statement unreachable. This is also evident from the generated dataflow:
import dace

@dace.program
def for_continue_loop(A: dace.int32[10]):
    for i in range(0, 10, 2):
        continue
        A[i] = i
Parses if statements. Example:
import dace

@dace.program
def if_stmt(a: dace.int32):
    if a < 0:
        return -1
    elif a > 0:
        return 1
    else:
        return 0
Allows parsing of PEP 572 assignment expressions (walrus operator), e.g., n := 5.
However, such expressions are currently treated by the ProgramVisitor as simple assignments.
In Python, assignment expressions allow assignments within comprehensions. Therefore, whether an assignment expression
will have the Python-equivalent effect in a Data-Centric Python program depends on the
support for those comprehensions.
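For reference, the following sketch shows the plain CPython behavior that the paragraph above refers to, using only the standard library and assuming Python 3.8 or newer. It does not involve DaCe; running_sums is a hypothetical helper introduced for illustration.

```python
import ast

# A parenthesized walrus expression parses to an ast.NamedExpr node (Python 3.8+).
node = ast.parse("(n := 5)", mode="eval").body
node_type = type(node).__name__


def running_sums(values):
    """Hypothetical helper: the walrus operator updates `total` inside a comprehension."""
    total = 0
    sums = [total := total + v for v in values]
    # `total` is bound in the enclosing function scope, not inside the comprehension.
    return sums, total


sums, total = running_sums([1, 2, 3, 4])
print(node_type)    # NamedExpr
print(sums, total)  # [1, 3, 6, 10] 10
```

Treating the expression as a simple assignment reproduces this effect only if the surrounding comprehension itself is supported.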
Parses assignment statements. Example:
import dace

@dace.program
def assign_stmt():
    a = 5
Parses annotated assignment statements. The ProgramVisitor
respects these type annotations and the assigned variables will have the same (DaCe-compatible) datatype as if the code
was executed through the CPython interpreter.
Parses augmented assignment statements. The ProgramVisitor
will try to infer whether the assigned memory location is read and written by a single thread. In such cases, the
assigned memory location will appear as both input and output in the generated subgraph. Otherwise, it will appear only as
output and the corresponding edge will have write-conflict resolution (WCR). Example:
import dace

@dace.program
def augassign_stmt():
    a = 0
    for i in range(10):
        a += 1
    for i in dace.map[0:10]:
        a += 1
Parses function call statements. These statements may call any of the following:
Another Data-Centric Python program: Execution is transferred to a nested ProgramVisitor.
A supported Python builtin or module (e.g., NumPy) method: Execution is transferred to the corresponding replacement method (see replacements).
An unsupported method: Generates a callback to the CPython interpreter.
Parses return statements.
Parses string constants. DEPRECATED in Python 3.8 and newer versions.
Parses numerical constants. DEPRECATED in Python 3.8 and newer versions.
Parses all constant values.
Parses names, e.g., variable names.
Parses name constants. DEPRECATED in Python 3.8 and newer versions.
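The deprecation notes above follow from a change in the standard ast module. Assuming Python 3.8 or newer, string, numeric, and name constants (True, False, None) are all parsed into ast.Constant nodes, while identifiers are parsed into ast.Name nodes. The sketch below uses only the standard library; the variable names are illustrative.

```python
import ast

# Parse a few literal expressions and record the resulting AST node class and value.
kinds = []
for src in ("'text'", "42", "3.5", "True", "None"):
    node = ast.parse(src, mode="eval").body
    kinds.append((type(node).__name__, node.value))

# An identifier is parsed into an ast.Name node instead.
name_node = ast.parse("x", mode="eval").body

for kind in kinds:
    print(kind)  # every entry reports the class name 'Constant'
print(type(name_node).__name__, name_node.id)  # Name x
```

On such interpreters, a visitor therefore receives ast.Constant nodes for all of these literals.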
Generates a string representation of a lambda function.
Parses unary operations.
Parses binary operations.
Parses boolean operations.
Parses index expressions in subscripts. DEPRECATED.
Parses slice expressions in subscripts. DEPRECATED.