- Title
Speaker Recognition Algorithm Based on Fca-Res2Net.
- Authors
Zhangfang Hu; Caiyun Lv; Changbo Wu
- Abstract
Traditional recognition methods often suffer from loss of speaker information and reduced recognition rates. To address these problems, this paper proposes an Fca-Res2Net speaker recognition model that incorporates a self-attention mechanism. First, the model takes modified mel-frequency cepstral coefficients (MFCCs) as its input, combining inverse mel-frequency cepstral coefficients (IMFCCs) with the MFCCs as base features to extract more representative spectral features of speech. On this basis, the difference parameters ΔMFCC and ΔIMFCC are fused in, so that both dynamic and static speech features are extracted in the high- and low-frequency bands. Second, frequency channel attention networks (FcaNets) are introduced on top of the baseline model (Res2Net: a new multiscale backbone architecture), and the residual module fuses shallow and deep speaker features, so that the weights of the different feature channels are obtained more effectively without increasing the number of parameters. In addition, to bring in temporal information and capture long-span speech features, the self-attention mechanism is integrated to strengthen long-span modelling. Finally, the classifier outputs the recognition result. Experimental results on the VoxCeleb dataset, which provides sufficient data volume, show that the proposed model improves speaker recognition rate and robustness on long speech compared with current mainstream speaker recognition methods.
- Subjects
ALGORITHMS; SPEECH
- Publication
IAENG International Journal of Computer Science, 2023, Vol 50, Issue 4, p1319
- ISSN
1819-656X
- Publication type
Article
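
The feature fusion described in the abstract (static MFCC and IMFCC features combined with their difference parameters) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `delta` and `fuse_features`, the regression window `width=2`, the 13-coefficient size, and the choice to fuse by concatenation are all assumptions made for illustration, and the MFCC and IMFCC matrices are random placeholders rather than features computed from audio.

```python
import numpy as np


def delta(features: np.ndarray, width: int = 2) -> np.ndarray:
    """Regression-based difference (delta) features along the time axis.

    features has shape (n_frames, n_coeffs). The window size `width`
    is an assumed value, not one taken from the paper.
    """
    n_frames = features.shape[0]
    # Repeat the edge frames so every frame has `width` neighbours per side.
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + n_frames]
        behind = padded[width - n : width - n + n_frames]
        out += n * (ahead - behind)
    return out / denom


def fuse_features(mfcc: np.ndarray, imfcc: np.ndarray, width: int = 2) -> np.ndarray:
    """Concatenate static and delta features frame by frame.

    Concatenation is an assumed fusion strategy for this sketch.
    """
    return np.concatenate(
        [mfcc, imfcc, delta(mfcc, width), delta(imfcc, width)], axis=1
    )


# Placeholder inputs: 100 frames of 13 coefficients each (hypothetical sizes).
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((100, 13))
imfcc = rng.standard_normal((100, 13))
fused = fuse_features(mfcc, imfcc)
```

In this sketch each frame ends up with four blocks of coefficients (MFCC, IMFCC, ΔMFCC, ΔIMFCC) side by side, which is one simple way to give a downstream network both static and dynamic information from the low- and high-frequency bands.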