Herein, the activation engine 516 and the pooling engine 518 may be referred to collectively as execution engines. The processing engine array 510 is another example of an execution engine. Another example of an execution engine is a Direct Memory Access (DMA) engine 570, which may be located outside the accelerator engine 502.
In some implementations, the accelerator engine 502 may include checkpoint registers 560 operable to implement checkpoint synchronization between the execution engines. The number of checkpoint registers 560 available in the accelerator engine 502 for checkpoint synchronization may be the same as the number of execution engines. The checkpoint registers 560 may be hardware registers capable of performing atomic set, increment, decrement, and comparison operations. The checkpoint registers 560 may include synchronization logic operable to synchronize execution between the execution engines. The synchronization logic may, among other operations, broadcast a value set by one of the execution engines in one of the checkpoint registers 560 to the other execution engines.
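As an illustration only, the behavior of the checkpoint registers 560 can be modeled in software. The following Python sketch is not the hardware design: the class name `CheckpointRegister`, the `subscribe` method, the helper `make_observer`, and the engine names are hypothetical, a lock stands in for hardware atomicity, and callbacks stand in for the broadcast path to the other execution engines. The sketch also assumes that every update is broadcast, whereas the description above only requires that a set value be broadcast.

```python
import threading


class CheckpointRegister:
    """Software model of one checkpoint register (hypothetical names)."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()  # stands in for hardware atomicity
        self._subscribers = []         # callbacks standing in for other engines

    def subscribe(self, callback):
        """Register a callback that receives each newly written value."""
        with self._lock:
            self._subscribers.append(callback)

    def _update(self, fn):
        # Apply the update atomically, then broadcast outside the lock.
        with self._lock:
            self._value = fn(self._value)
            new_value = self._value
            subscribers = list(self._subscribers)
        for callback in subscribers:
            callback(new_value)
        return new_value

    def set(self, value):
        return self._update(lambda _: value)

    def increment(self):
        return self._update(lambda v: v + 1)

    def decrement(self):
        return self._update(lambda v: v - 1)

    def compare(self, expected):
        """Atomically test whether the register holds the expected value."""
        with self._lock:
            return self._value == expected


# Assumed setup: one register per execution engine, as described above.
engines = ["processing_engine_array", "activation", "pooling", "dma"]
registers = {name: CheckpointRegister() for name in engines}
views = {name: {} for name in engines}  # each engine's view of the others


def make_observer(observer, owner):
    # Record the broadcast value in the observing engine's local view.
    def on_broadcast(value):
        views[observer][owner] = value
    return on_broadcast


for owner, register in registers.items():
    for observer in engines:
        if observer != owner:
            register.subscribe(make_observer(observer, owner))

# The activation engine signals a checkpoint; the other engines observe it.
registers["activation"].set(1)
```

In this model, an engine waiting on a checkpoint would poll `compare` on the relevant register or react to the broadcast value in its local view; which mechanism the hardware uses is not specified here.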