- Title
A Multi-Resolution Approach to GAN-Based Speech Enhancement.
- Authors
Kim, Hyung Yong; Yoon, Ji Won; Cheon, Sung Jun; Kang, Woo Hyun; Kim, Nam Soo
- Abstract
Recently, generative adversarial networks (GANs) have been successfully applied to speech enhancement. However, there still remain two issues that need to be addressed: (1) GAN-based training is typically unstable due to its non-convex property, and (2) most of the conventional methods do not fully take advantage of the speech characteristics, which could result in a sub-optimal solution. In order to deal with these problems, we propose a progressive generator that can handle the speech in a multi-resolution fashion. Additionally, we propose a multi-scale discriminator that discriminates the real and generated speech at various sampling rates to stabilize GAN training. The proposed structure was compared with the conventional GAN-based speech enhancement algorithms using the VoiceBank-DEMAND dataset. Experimental results showed that the proposed approach can make the training faster and more stable, which improves the performance on various metrics for speech enhancement.
- Subjects
Speech enhancement; Generative adversarial networks; Convolutional neural networks; Key performance indicators (Management)
- Publication
Applied Sciences (2076-3417), 2021, Vol 11, Issue 2, p721
- ISSN
2076-3417
- Publication type
Article
- DOI
10.3390/app11020721