The weights for the neural network can be stored in the memory subsystem 504, along with input data 550 on which the neural network will operate. The neural network can also include instructions, which can program the processing engine array 510 to perform various computations on the weights and the input data. The instructions can also be stored in the memory subsystem 504, in the memory banks 514 or in a separate instruction buffer. The processing engine array 510 can output intermediate results, which represent the outputs of individual layers of the neural network. In some cases, the activation engine 516 and/or pooling engine 518 may be enabled for computations called for by certain layers of the neural network. The accelerator engine 502 can store the intermediate results in the memory subsystem 504 so that they can be input into the processing engine array 510 to compute results for the next layer of the neural network. The processing engine array 510 can further output final results from a last layer of the neural network. The final results can be stored in the memory subsystem 504 and then copied out to host processor memory or to another location.
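The layer-by-layer data flow described above can be illustrated with a minimal software sketch. This is only a conceptual model, not the accelerator's actual programming interface: the dictionary `memory` stands in for the memory subsystem 504, `matvec` for the processing engine array 510, `relu` for the activation engine 516, and `max_pool` for the pooling engine 518. All function names, keys, and the choice of ReLU and max pooling are assumptions made for illustration.

```python
# Conceptual model only. Every name here is hypothetical and chosen for
# illustration; none of it is taken from the described hardware.

def matvec(weights, inputs):
    """Stand-in for the processing engine array: a matrix-vector product."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def relu(values):
    """Stand-in for the activation engine (ReLU is an assumed choice)."""
    return [max(0.0, v) for v in values]


def max_pool(values, width=2):
    """Stand-in for the pooling engine: max over non-overlapping windows."""
    return [max(values[i:i + width]) for i in range(0, len(values), width)]


def run_network(memory, layers):
    """Run the layers in order, staging each intermediate result in `memory`.

    Each layer reads its weights and the previous layer's output from
    `memory`, and writes its own output back so the next layer can use it.
    """
    current_key = "input"
    for index, layer in enumerate(layers):
        result = matvec(memory[layer["weights"]], memory[current_key])
        # The activation and pooling stages are enabled only for layers
        # that call for them.
        if layer.get("activation"):
            result = relu(result)
        if layer.get("pooling"):
            result = max_pool(result)
        current_key = f"layer{index}_out"
        memory[current_key] = result  # intermediate (or final) result
    # Copy the final result out of `memory`, as if to host processor memory.
    return list(memory[current_key])


# Weights and input data are placed in memory before execution starts.
memory = {
    "input": [1.0, 2.0],
    "w0": [[1.0, 0.0], [0.0, -1.0], [1.0, 1.0], [-1.0, -1.0]],
    "w1": [[1.0, 1.0]],
}
layers = [
    {"weights": "w0", "activation": True, "pooling": True},
    {"weights": "w1"},
]
final = run_network(memory, layers)
```

In this sketch the first layer's output remains in `memory` under `layer0_out` after the run, mirroring how intermediate results are kept in the memory subsystem until the next layer consumes them, while `final` holds the copy of the last layer's output.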