In various examples, when an integrated circuit device includes multiple execution engines, the compiler for the device can produce a separate set of instructions for each execution engine. The instructions for an execution engine can include steps such as reading data from memory of the device, performing a computation on the data, and writing a result of the computation back to the memory of the device. In some examples, the execution engines can independently execute their respective sets of instructions, so that the execution engines can operate in parallel.
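As a purely illustrative sketch of the read, compute, and write-back steps described above, the following Python models two execution engines as threads that each run their own instruction list against a shared memory. The memory layout, the engine names, and the `run_engine` helper are assumptions made for this example and are not taken from any particular device or compiler.

```python
import threading

# Hypothetical device memory: a flat list indexed by address.
# Addresses 0-3 hold inputs; addresses 4-5 are reserved for results.
memory = [3, 4, 10, 20, 0, 0]

def run_engine(program, memory):
    """Execute one engine's instruction list in order."""
    for src_addrs, compute, dst_addr in program:
        inputs = [memory[a] for a in src_addrs]  # read data from device memory
        result = compute(*inputs)                # perform the computation
        memory[dst_addr] = result                # write the result back

# One instruction list per engine, as a compiler might emit them.
# Each instruction is (source addresses, computation, destination address).
programs = {
    "engine_a": [((0, 1), lambda x, y: x * y, 4)],
    "engine_b": [((2, 3), lambda x, y: x + y, 5)],
}

# The two programs touch disjoint addresses, so they can run in parallel.
threads = [
    threading.Thread(target=run_engine, args=(program, memory))
    for program in programs.values()
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the two instruction lists in this sketch read and write disjoint addresses, neither engine needs to wait for the other.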
In some examples, however, the operation of one execution engine may be dependent on the operation of another execution engine. For example, a result computed by one execution engine may be needed as the input of an operation to be performed by another execution engine. Limitations of the integrated circuit device can also cause dependencies between the execution engines. For example, the device may have a limited amount of memory or a limited number of registers in which inputs for and results from the execution engines can be stored. In this example, one execution engine may need to store a result in a memory location in which the inputs for another execution engine are stored, so that the first execution engine may need to wait until the other execution engine has read those inputs before overwriting them.
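The two kinds of dependency described above can be illustrated with a minimal Python sketch that compares the memory locations each operation reads and writes. The `Op` record, the buffer names, and the assumption that list order reflects intended program order are all hypothetical; the sketch also ignores the case where two engines write the same location.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Op:
    name: str
    engine: str
    reads: frozenset   # memory locations the operation reads
    writes: frozenset  # memory locations the operation writes

def find_dependencies(ops):
    """Return (earlier, later, kind) tuples for operations on different engines.

    Assumes the order of `ops` is the intended program order.
    """
    deps = []
    for i, earlier in enumerate(ops):
        for later in ops[i + 1:]:
            if earlier.engine == later.engine:
                continue  # same engine: already ordered by its instruction list
            if earlier.writes & later.reads:
                # The later operation consumes a result of the earlier one.
                deps.append((earlier.name, later.name, "data"))
            if earlier.reads & later.writes:
                # The later operation overwrites inputs the earlier one still needs.
                deps.append((earlier.name, later.name, "memory reuse"))
    return deps

ops = [
    Op("matmul", "engine_a", frozenset({"in0"}), frozenset({"buf0"})),
    Op("activate", "engine_b", frozenset({"buf0"}), frozenset({"buf1"})),
    Op("load_next", "engine_a", frozenset({"in1"}), frozenset({"buf0"})),
]
deps = find_dependencies(ops)
```

In this sketch, `activate` depends on `matmul` because it reads the result in `buf0`, and `load_next` depends on `activate` because it reuses `buf0` while `activate` still needs its contents.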
When the operations of the execution engines of an integrated circuit device can have dependencies such as those described above, the compiler for the device can capture the dependencies, for example, in a dependency or dataflow graph. In a dataflow graph, nodes in the graph can represent operations or sets of operations to be performed by individual execution engines. The edges or connections between the nodes can represent dependencies between the operations at the nodes.
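As one possible illustration of such a graph, the Python sketch below stores each node with the execution engine assigned to it and each edge as a (producer, consumer) pair, then derives an execution order that respects every edge. The operation names and engine names are hypothetical, and the use of the standard library `graphlib` module (Python 3.9 or later) is a convenience of this example rather than a description of how any particular compiler is implemented.

```python
from graphlib import TopologicalSorter

# Nodes: operation name -> execution engine assigned to it (hypothetical names).
nodes = {
    "load": "engine_0",
    "matmul": "engine_1",
    "pool": "engine_2",
    "add": "engine_1",
}

# Edges: (producer, consumer) means the consumer depends on the producer.
edges = [
    ("load", "matmul"),
    ("load", "pool"),
    ("matmul", "add"),
    ("pool", "add"),
]

# TopologicalSorter expects a mapping from each node to its predecessors.
predecessors = {name: set() for name in nodes}
for producer, consumer in edges:
    predecessors[consumer].add(producer)

# Any order produced here runs every operation after the operations it depends on.
order = list(TopologicalSorter(predecessors).static_order())
```

In this sketch, `matmul` and `pool` have no edge between them, so either may appear first in `order`; only operations connected by an edge are constrained relative to one another.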