
Synchronization of concurrent computation engines

Patent number

US11175919B1

Publication date

2021-11-16

Applicant

Amazon Technologies, Inc. (US WA Seattle)

Inventors

Ilya Minkin; Ron Diamant; Drazen Borkovic; Jindrich Zejda; Dana Michelle Vantrease

IPC classification

G06F9/30; G06F9/35; G06F13/28; G06F9/38; G06F9/52; G06N3/06

Technical field

checkpoint, engine, execution, register, ckpt1, engines, in, wait, value, can

Region: WA WA Seattle

Abstract

Integrated circuit devices and methods for synchronizing execution of program code for multiple concurrently operating execution engines of the integrated circuit devices are provided. In some cases, one execution engine of an integrated circuit device may be dependent on the operation of another execution engine of the integrated circuit device. To synchronize the execution engines around the dependency, a first execution engine may execute an instruction to set a value in a register while a second execution engine may execute an instruction to wait for a condition associated with the register value.
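The set-and-wait pattern described in the abstract can be illustrated in software. The following is a minimal sketch only, assuming two Python threads stand in for the execution engines and a condition variable stands in for the hardware register; the names `SyncRegister`, `set`, `wait_until`, and `run_demo` are hypothetical and do not come from the patent.

```python
import threading


class SyncRegister:
    """Hypothetical software stand-in for a synchronization register."""

    def __init__(self, value=0):
        self._value = value
        self._cond = threading.Condition()

    def set(self, value):
        # Analogue of the "set a value in a register" instruction.
        with self._cond:
            self._value = value
            self._cond.notify_all()

    def wait_until(self, predicate, timeout=None):
        # Analogue of the "wait for a condition on the register value"
        # instruction. Returns True if the condition was met in time.
        with self._cond:
            return self._cond.wait_for(
                lambda: predicate(self._value), timeout=timeout
            )


def run_demo():
    reg = SyncRegister()
    shared = {}
    log = []

    def first_engine():
        # Produce the data the second engine depends on, then signal.
        shared["data"] = 42
        log.append("first: produced")
        reg.set(1)

    def second_engine():
        # Block until the register shows the dependency is satisfied.
        ready = reg.wait_until(lambda v: v >= 1, timeout=5.0)
        if ready:
            shared["result"] = shared["data"] + 1
            log.append("second: consumed")
        else:
            log.append("second: timeout")

    # Start the waiting engine first to show that it blocks on the register.
    t2 = threading.Thread(target=second_engine)
    t1 = threading.Thread(target=first_engine)
    t2.start()
    t1.start()
    t1.join()
    t2.join()
    return shared, log
```

Because the second thread cannot pass `wait_until` before the first thread calls `set`, the dependent read of `shared["data"]` always happens after the write, regardless of which thread is scheduled first.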

Description


The weights for the neural network can be stored in the memory subsystem 504, along with input data 550 on which the neural network will operate. The neural network can also include instructions, which can program the processing engine array 510 to perform various computations on the weights and the input data. The instructions can also be stored in the memory subsystem 504, in the memory banks 514 or in a separate instruction buffer. The processing engine array 510 can output intermediate results, which represent the outputs of individual layers of the neural network. In some cases, the activation engine 516 and/or pooling engine 518 may be enabled for computations called for by certain layers of the neural network. The accelerator engine 502 can store the intermediate results in the memory subsystem 504 for inputting into the processing engine array 510 to compute results for the next layer of the neural network. The processing engine array 510 can further output final results from a last layer of the neural network. The final results can be stored in the memory subsystem 504 and then be copied out to host processor memory or to another location.
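The layer-by-layer data flow in the paragraph above can be sketched in a few lines. This is an illustrative model only, assuming a plain dictionary stands in for the memory subsystem, a matrix-vector product stands in for the processing engine array, and ReLU stands in for the activation engine; the function names and the `layers` description format are hypothetical and not taken from the patent.

```python
def matvec(weights, vector):
    # Stand-in for the processing engine array: one matrix-vector product.
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def relu(vector):
    # Stand-in for the activation engine, enabled only for some layers.
    return [max(0.0, v) for v in vector]


def run_network(memory, layers):
    """Run each layer, writing intermediate results back to `memory`.

    `memory` maps names to stored data (weights, input, results).
    `layers` is a non-empty list of dicts with a "weights" key and an
    optional "activation" flag.
    """
    current_key = "input"
    for index, layer in enumerate(layers):
        result = matvec(memory[layer["weights"]], memory[current_key])
        if layer.get("activation"):
            result = relu(result)
        is_last = index == len(layers) - 1
        # Intermediate results are stored and fed into the next layer;
        # the last layer's output is stored as the final result.
        current_key = "final" if is_last else f"intermediate_{index}"
        memory[current_key] = result
    return memory["final"]


memory = {
    "input": [1.0, -2.0],
    "w0": [[1.0, 0.0], [0.0, 1.0]],
    "w1": [[2.0, 3.0]],
}
layers = [
    {"weights": "w0", "activation": True},
    {"weights": "w1", "activation": False},
]
final = run_network(memory, layers)
```

After the call, `memory` holds both the intermediate result of the first layer and the final result, mirroring the description of results being kept in the memory subsystem before being copied out.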

