Hardware accelerator for executing a computation task

Patent number
US11175957B1
Publication date
2021-11-16
Applicant
International Business Machines Corporation (US NY Armonk)
Inventors
Dionysios Diamantopoulos; Florian Michael Scheidegger; Adelmo Cristiano Innocenza Malossi; Christoph Hagleitner; Konstantinos Bekas
IPC classification
G06F9/30; G06F9/50; G06F9/38
Technical field
bit, may, unit, input, be, units, data, tensor, operands, hardware
Region: NY NY Armonk

Abstract

The present disclosure relates to a hardware accelerator for executing a computation task composed of a set of operations. The hardware accelerator comprises a controller and a set of computation units. Each computation unit of the set of computation units is configured to receive input data of an operation of the set of operations and to perform the operation, wherein the input data is represented with a distinct bit length associated with each computation unit. The controller is configured to receive the input data represented with a certain bit length of the bit lengths and to select one of the set of computation units that can deliver a valid result and that is associated with a bit length smaller than or equal to the certain bit length.

Description

Each pair of units may be connected to a controller 215 (also referred to as a speculative precision controller (SPC)). For each pair, the controller 215 is configured to receive the input data represented with the highest bit length and to select whichever unit of the pair can deliver a valid result first; the output of the selected unit is provided as the result of the operation. For example, for the pair comprising the computation unit 210A and the replication unit 212A, the controller 215 is configured to receive the input data of the operation 201A, wherein the received input data are in precision A, which corresponds to the highest bit length (8 bits). The controller 215 may select whichever of the two units 210A and 212A can deliver a valid result first. The replication unit 212A may perform the operation 201A faster than the computation unit 210A because it uses data with less precision. The controller 215 may therefore decide whether the result provided by the replication unit 212A is a valid result. To do so, the controller may, for example, calculate the number of leading zeros in the operands of the input data of the operation 201A (e.g., as described with reference to FIG. 4). As soon as the controller makes a decision, the result of the selected unit is forwarded, and the remaining speculative execution may be forced to cancel its calculation. The hardware accelerator 202 further comprises an on-chip memory 214, e.g., a tensor on-chip memory, for storing intermediate processing results and/or providing the necessary data to each computation unit every cycle.
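
The leading-zero check mentioned above can be illustrated with a minimal Python sketch. It is not the patented circuit and does not reproduce FIG. 4. It assumes unsigned integer operands, a hypothetical reduced precision of 4 bits for the replication unit, and a validity rule under which the lower-precision result is accepted only when every operand fits in the reduced width. The names leading_zeros, low_precision_is_valid and select_unit are illustrative and do not appear in the disclosure.

```python
from typing import Sequence


def leading_zeros(value: int, width: int) -> int:
    """Count the leading zero bits of an unsigned value stored in `width` bits."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"{value} does not fit in {width} unsigned bits")
    return width - value.bit_length()


def low_precision_is_valid(operands: Sequence[int],
                           full_width: int = 8,
                           reduced_width: int = 4) -> bool:
    """Assumed validity rule: every operand must fit in `reduced_width` bits.

    An operand fits when it has at least (full_width - reduced_width)
    leading zeros in its full-width representation.
    """
    needed = full_width - reduced_width
    return all(leading_zeros(op, full_width) >= needed for op in operands)


def select_unit(operands: Sequence[int],
                full_width: int = 8,
                reduced_width: int = 4) -> str:
    """Return which unit of the pair the controller would forward (illustrative)."""
    if low_precision_is_valid(operands, full_width, reduced_width):
        return "replication_unit"   # faster, lower-precision unit is valid
    return "computation_unit"       # fall back to the full-precision unit


if __name__ == "__main__":
    print(select_unit([3, 5]))    # both operands fit in 4 bits
    print(select_unit([3, 200]))  # 200 needs all 8 bits
```

In this sketch the decision depends only on the operands, so it can be made before either unit finishes, which matches the idea of cancelling the speculative execution that is not selected.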

Claims

1