Multi-Threading and Overlapping Architecture for Accelerating Depthwise Separable Convolution on a Systolic Array
- Title
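- Multi-Threading and Overlapping Architecture for Accelerating Depthwise Separable Convolution on a Systolic Array (original title: 시스톨릭 행렬 구조에서의 깊이별 분리 합성곱 연산 가속을 위한 다중 스레드와 오버래핑 구조)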
- Authors
- 이승규
- Date Issued
- 2022
- Publisher
- 포항공과대학교 (Pohang University of Science and Technology, POSTECH)
- Abstract
- A systolic array (SA) is widely used in neural network hardware accelerators such as the Google TPU and NVIDIA Tensor Cores. However, the low utilization of processing elements (PEs) during depthwise separable convolution is one of the obstacles to increasing SA throughput. In this thesis, I propose a multiple-data-line design for a weight-stationary SA that targets depthwise separable convolution. The proposed SA adds data lines to eliminate redundant multiplications and maximize throughput in depthwise convolution. In addition, it uses the PEs that are left idle during the depthwise convolution to compute the following pointwise convolution at the same time. Once the depthwise convolution is complete, further PEs become idle, and I use these to accelerate the remaining pointwise convolution. As a result, on MobileNetV3 the proposed 128×128 SA achieves speed-ups of 4.05× and 1.75× and reduces energy consumption by 66.7% and 25.4% compared to the basic SA and RiSA, respectively.
- URI
- http://postech.dcollection.net/common/orgView/200000638262
- https://oasis.postech.ac.kr/handle/2014.oak/117423
- Article Type
- Thesis