- Title
The position-based compression techniques for DNN model.
- Authors
Tang, Minghua; Russo, Enrico; Palesi, Maurizio
- Abstract
In deep neural network (DNN) accelerators, transferring model parameters from main memory to the processing elements is expensive: data movement accounts for a large fraction of inference latency and energy consumption. In this paper, we present three position-based, lossless techniques for compressing DNN model parameters, which can yield significant energy and performance improvements. The first technique exploits the regular repetition of DNN weights to compress them. The second stores the relative distance between weights instead of the weights themselves. The third applies Huffman coding to the relative distances produced by the second technique. The proposed techniques are assessed on several DNNs. The results show that the first technique reduces latency by 38% and energy by 36%, the second reduces latency by 41% and energy by 39%, and the third reduces latency by 45% and energy by 43%. Applying Huffman coding thus achieves a further 7% reduction in both latency and energy relative to the second technique.
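The abstract does not spell out the encodings, but the second and third techniques read like delta coding of the weight stream followed by entropy coding. Below is a minimal Python sketch of that combination, assuming the second technique is plain differencing over a quantized (integer) weight sequence; the function names (`delta_encode`, `delta_decode`, `huffman_code`) and the toy input are illustrative assumptions, not the paper's implementation.

```python
import heapq
from collections import Counter

def delta_encode(weights):
    """Technique 2 (sketch): keep the first value, then successive
    differences. Lossless: delta_decode recovers the input exactly."""
    if not weights:
        return []
    return [weights[0]] + [b - a for a, b in zip(weights, weights[1:])]

def delta_decode(deltas):
    """Inverse of delta_encode via a running sum."""
    out, acc = [], 0
    for d in deltas:
        acc += d
        out.append(acc)
    return out

def huffman_code(symbols):
    """Technique 3 (sketch): greedy Huffman construction over symbol
    frequencies. Returns {symbol: bitstring}; frequent deltas get
    shorter codes, so small repeated distances compress well."""
    freq = Counter(symbols)
    if len(freq) == 1:                      # degenerate single-symbol case
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, tie-breaker, {symbol: code-so-far}).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, i2, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (n1 + n2, i2, merged))
    return heap[0][2]

# Usage on a toy quantized weight stream (illustrative only):
weights = [7, 7, 8, 8, 8, 9, 7, 7]
deltas = delta_encode(weights)              # [7, 0, 1, 0, 0, 1, -2, 0]
table = huffman_code(deltas)
bitstream = "".join(table[d] for d in deltas)
assert delta_decode(deltas) == weights      # lossless round trip
```

Because repeated weights produce runs of zero deltas, the delta stream is highly skewed, which is exactly the distribution Huffman coding exploits; this is consistent with the reported extra 7% savings of technique 3 over technique 2.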
- Subjects
Artificial neural networks; Huffman codes; Energy consumption
- Publication
The Journal of Supercomputing, 2023, Vol 79, Issue 15, p17445
- ISSN
0920-8542
- Publication type
Article
- DOI
10.1007/s11227-023-05339-4