- Title
On Block g-Circulant Matrices with Discrete Cosine and Sine Transforms for Transformer-Based Translation Machine.
- Authors
Asriani, Euis; Muchtadi-Alamsyah, Intan; Purwarianti, Ayu
- Abstract
The transformer has emerged as one of the modern neural network architectures applied in numerous tasks. However, its large and deep architecture makes it computationally and memory-intensive. In this paper, we propose replacing the dense weight matrices in the feedforward layers of the transformer with block g-circulant matrices and leveraging the DCT-DST algorithm to multiply these matrices by the input vector. Our tests using Portuguese-English datasets show that the proposed method improves model memory efficiency compared to the dense transformer, at the cost of a slight drop in accuracy. The Dense-block 1-circulant DCT-DST model with dimension 128 achieved the highest model memory efficiency at 22.14%; the same model achieved a BLEU score of 26.47%.
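For context, the structural idea behind the abstract is that a (block) g-circulant matrix is fully determined by a single row, so it needs O(n) storage instead of O(n²), and its product with a vector admits a fast-transform algorithm. The sketch below is not the authors' DCT-DST implementation; it is a minimal illustration, assuming the ordinary g = 1 (circulant) case and substituting NumPy's FFT for the fast multiplication, with the dimension 128 borrowed from the abstract.

```python
# Minimal sketch (not the paper's DCT-DST method): a g-circulant weight matrix
# is defined by one row, and for g = 1 its matrix-vector product reduces to a
# circular convolution, computable in O(n log n) with the FFT.
import numpy as np

def g_circulant(first_row: np.ndarray, g: int) -> np.ndarray:
    """Build an n x n g-circulant matrix: row i is the first row shifted right by i*g."""
    n = first_row.shape[0]
    return np.stack([np.roll(first_row, (i * g) % n) for i in range(n)])

def circulant_matvec_fft(first_row: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Multiply the ordinary (g = 1) circulant matrix by x via the FFT."""
    first_col = np.roll(first_row[::-1], 1)   # first column of the circulant
    return np.real(np.fft.ifft(np.fft.fft(first_col) * np.fft.fft(x)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 128                                   # dimension quoted in the abstract
    w = rng.standard_normal(n)                # n parameters instead of n * n
    x = rng.standard_normal(n)
    dense = g_circulant(w, g=1)               # explicit matrix, built only to verify
    assert np.allclose(dense @ x, circulant_matvec_fft(w, x), atol=1e-8)
    print("parameters: dense", dense.size, "vs circulant", w.size)
```

The paper's contribution replaces the complex FFT route with real DCT-DST factorizations and extends it to block g-circulant weights; the sketch only shows why such structured matrices cut memory relative to a dense layer.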
- Subjects
MACHINE translating; DISCRETE cosine transforms; TRANSFORMER models; POWER transformers; MATRICES (Mathematics); KRONECKER products
- Publication
Mathematics (2227-7390), 2024, Vol 12, Issue 11, p1697
- ISSN
2227-7390
- Publication type
Article
- DOI
10.3390/math12111697