
Hardware accelerator for executing a computation task

Patent number
US11175957B1
Publication date
2021-11-16
Applicant
International Business Machines Corporation (US NY Armonk)
Inventors
Dionysios Diamantopoulos; Florian Michael Scheidegger; Adelmo Cristiano Innocenza Malossi; Christoph Hagleitner; Konstantinos Bekas
IPC classification
G06F9/30; G06F9/50; G06F9/38
Technical field
bit, may, unit, input, be, units, data, tensor, operands, hardware
Region: NY NY Armonk

Abstract

The present disclosure relates to a hardware accelerator for executing a computation task composed of a set of operations. The hardware accelerator comprises a controller and a set of computation units. Each computation unit of the set of computation units is configured to receive input data of an operation of the set of operations and to perform the operation, wherein the input data is represented with a distinct bit length associated with each computation unit. The controller is configured to receive the input data represented with a certain bit length of the bit lengths and to select one of the set of computation units that can deliver a valid result and that is associated with a bit length smaller than or equal to the certain bit length.
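The selection rule in the abstract can be illustrated with a minimal Python sketch. The abstract does not define what makes a result "valid" or which eligible unit is preferred, so both are assumptions here: a unit is treated as valid when every unsigned operand fits in its bit length, and the controller picks the narrowest valid unit. The names `ComputationUnit`, `fits`, and `select_unit` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputationUnit:
    """Hypothetical model of one computation unit with a fixed operand bit length."""
    bits: int

    def fits(self, value: int) -> bool:
        # Assumed validity rule: an unsigned operand is representable in `bits` bits.
        return 0 <= value < (1 << self.bits)

    def multiply(self, a: int, b: int) -> int:
        # The unit only sees the low `bits` bits of each operand.
        mask = (1 << self.bits) - 1
        return (a & mask) * (b & mask)


def select_unit(units, input_bits, a, b):
    """Pick a unit whose bit length is <= input_bits and that yields a valid result.

    Choosing the narrowest such unit is an assumption of this sketch.
    """
    candidates = [
        u for u in units
        if u.bits <= input_bits and u.fits(a) and u.fits(b)
    ]
    if not candidates:
        raise ValueError("no computation unit can deliver a valid result")
    return min(candidates, key=lambda u: u.bits)


# Example bit lengths taken from the description: 2-bit, 6-bit, and full 8-bit.
units = [ComputationUnit(2), ComputationUnit(6), ComputationUnit(8)]

if __name__ == "__main__":
    unit = select_unit(units, 8, 40, 3)
    print(unit.bits, unit.multiply(40, 3))
```

In hardware the units would more likely run in parallel, with the controller choosing among their outputs; the sequential filter above only mirrors the selection criterion, not the timing.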

Description

A data operation (e.g., of a neural network) may be defined in step 701. The data operation may, for example, be a tensor accelerator function. An empirical model may be created in step 703. The empirical model may define how the FPGA resources of step 704 for performing the data operation scale with different precisions, e.g., it may define how the FPGA resources required for performing the data operation using a 6-bit representation can be obtained from the FPGA resources used for performing the data operation using an 8-bit representation. As indicated in FIG. 7, step 703 may be optional, as the FPGA resources provided in step 704 may be sufficient to generate the bitstream file.
A profile of the neural network and input datasets provided in step 705 may be used to determine the speculation granularity in step 706. The speculation granularity may indicate the bit representations that can be used (in addition to the full 8-bit precision) to perform the data operation, e.g., a 2-bit representation and a 6-bit representation may be determined in step 706. The automatic generator 700 may receive as input the data operation, the empirical model, the speculation granularity, and the FPGA resources in order to generate a bitstream file that can create replication units in the FPGA 710 according to the speculation granularity. The bitstream file may configure the FPGA 710 so that the data operation can be performed in accordance with the present invention.
As indicated in FIG. 7, different tools may be used to perform the method of FIG. 7. For example, steps 700, 703, and 705 may be implemented with a software tool such as Python. The FPGA resources and the speculation granularity may be described in constraint files such as JSON files.
The tensor accelerator function in step 701 may be implemented in a high-level programming language (e.g., C, C++, SystemC, OpenCL, Chisel, Python, etc.), a hardware description language (HDL) (e.g., VHDL, Verilog, etc.), or any form of semiconductor intellectual property core (e.g., soft-core, hard-core, encrypted netlist, etc.).
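As a rough illustration of how the inputs to the automatic generator 700 might fit together, the Python sketch below reads a JSON constraint description and plans one replication unit per bit length. Everything specific in it is an assumption made for illustration: the JSON field names, the resource figures, and especially the linear scaling rule in `scaled_cost`, which stands in for the empirical model of step 703 and is not taken from the patent. It stops at a resource plan and does not generate a bitstream.

```python
import json

# Hypothetical constraint file contents; field names and numbers are illustrative only.
CONSTRAINTS_JSON = """
{
  "full_precision_bits": 8,
  "speculation_granularity": [2, 6],
  "fpga_resources": {"lut": 10000, "dsp": 64},
  "full_precision_unit_cost": {"lut": 1200, "dsp": 8}
}
"""


def scaled_cost(full_cost, full_bits, bits):
    """Placeholder empirical model: cost scales linearly with bit length, rounded up.

    This linear rule is an assumption; a real model would be fitted to measurements.
    """
    return {name: -(-amount * bits // full_bits) for name, amount in full_cost.items()}


def plan_units(constraints):
    """Plan one unit per speculative bit length plus the full-precision unit."""
    full_bits = constraints["full_precision_bits"]
    bit_lengths = sorted(set(constraints["speculation_granularity"]) | {full_bits})
    budget = constraints["fpga_resources"]
    used = {name: 0 for name in budget}
    plan = []
    for bits in bit_lengths:
        cost = scaled_cost(constraints["full_precision_unit_cost"], full_bits, bits)
        for name, amount in cost.items():
            used[name] += amount
        plan.append({"bits": bits, "cost": cost})
    # Reject the plan if any resource type exceeds the available FPGA budget.
    over = {name: used[name] - budget[name] for name in budget if used[name] > budget[name]}
    if over:
        raise ValueError(f"plan exceeds FPGA resources by {over}")
    return plan, used


constraints = json.loads(CONSTRAINTS_JSON)
plan, used = plan_units(constraints)

if __name__ == "__main__":
    for unit in plan:
        print(unit)
    print("total:", used)
```

Keeping the speculation granularity and the resource budget in one JSON document follows the description's mention of constraint files; splitting them into separate files would work equally well.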

Claims

1