- Title
Low-latency transformer model for streaming automatic speech recognition.
- Authors
Miao, Haoran; Cheng, Gaofeng; Zhang, Pengyuan
- Abstract
Transformer models have made great progress in automatic speech recognition. However, it is challenging for streaming transformer models to trade off output latency against recognition accuracy. This letter aims to propose a low-latency transformer model with satisfactory recognition accuracy. First, the authors design a streaming transformer and explain how it operates in a streaming fashion. Second, they propose using connectionist temporal classification (CTC) during training to minimise the latency of transformer models. Finally, they propose utilising CTC as a backup during decoding to ensure that the low-latency characteristic is maintained. The authors fairly compare their streaming transformer model to existing streaming models, particularly the transducer model, a popular low-latency approach. The experiments show that, with comparable output latency, the transformer model outperforms the transducer model by average relative character (or word) error rate reductions of 22.18%, 26.71% and 19.36% on HKUST, Switchboard and CallHome, respectively.
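
The record gives only the high-level recipe, but the training-time use of CTC that the abstract describes is commonly realised as a joint CTC/attention objective: a linear CTC head on the encoder output is trained alongside the decoder's cross-entropy loss. Below is a minimal PyTorch sketch of that idea; the class name `JointCTCAttentionLoss`, the weight `lambda_ctc` and the tensor shapes are illustrative assumptions, not the paper's actual implementation, and the decode-time CTC backup the abstract mentions is not specified in this record.

```python
import torch
import torch.nn as nn

class JointCTCAttentionLoss(nn.Module):
    """Sketch of a joint CTC/attention training objective (assumed setup,
    not taken from the paper): an auxiliary CTC loss on the encoder is
    interpolated with the decoder's cross-entropy loss."""

    def __init__(self, vocab_size: int, enc_dim: int, blank: int = 0,
                 lambda_ctc: float = 0.3):
        super().__init__()
        self.ctc_proj = nn.Linear(enc_dim, vocab_size)   # encoder states -> CTC logits
        self.ctc = nn.CTCLoss(blank=blank, zero_infinity=True)
        self.att = nn.CrossEntropyLoss(ignore_index=-1)  # -1 marks padded tokens
        self.lambda_ctc = lambda_ctc

    def forward(self, enc_out, enc_lens, dec_logits,
                ctc_targets, att_targets, target_lens):
        # enc_out: (B, T, D) encoder states; dec_logits: (B, U, V) decoder scores
        # CTC branch: log-probs in (T, B, V) layout, as nn.CTCLoss expects
        log_probs = self.ctc_proj(enc_out).log_softmax(-1).transpose(0, 1)
        ctc_loss = self.ctc(log_probs, ctc_targets, enc_lens, target_lens)
        # Attention branch: token-level cross-entropy on the decoder outputs
        att_loss = self.att(dec_logits.reshape(-1, dec_logits.size(-1)),
                            att_targets.reshape(-1))
        return self.lambda_ctc * ctc_loss + (1 - self.lambda_ctc) * att_loss
```

In this kind of setup, the CTC branch's frame-level alignment pressure is what discourages the encoder from delaying token emission, which matches the latency motivation stated in the abstract; the interpolation weight is a tunable hyperparameter here, not a value reported by the authors.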
- Subjects
AUTOMATIC speech recognition; TRANSDUCERS; PATTERN recognition systems; ELECTROMECHANICAL devices; ELECTRONIC equipment
- Publication
Electronics Letters (Wiley-Blackwell), 2022, Vol 58, Issue 1, p44
- ISSN
0013-5194
- Publication type
Article
- DOI
10.1049/ell2.12349