Chen, Zhigao; Miao, Xiaoxiao; Xiao, Runqiu; Wang, Wenchao

doi:10.1049/el.2020.0673


Title: Cross-domain speaker recognition using domain adversarial siamese network with a domain discriminator.
Authors: Chen, Zhigao; Miao, Xiaoxiao; Xiao, Runqiu; Wang, Wenchao
Abstract: Automatic speaker recognition is widely deployed in the real world, but its performance suffers considerably under domain mismatch, such as differences in channel, language, or distance. Recent studies have introduced adversarial learning into deep neural networks to reduce the distribution mismatch between domains. However, they only aligned the domain distributions between the background training data and the evaluation data; few focused on the differing distributions underlying the enrolment and test data. In this Letter, the authors propose a domain adversarial siamese (DAS) network that aims to eliminate the influence of domain on the speech representation. Specifically, they feed the network with speech pairs from the same speaker. A domain discriminator is then introduced to capture the domain consistency or difference between the pairs. The final embeddings become domain-invariant and more speaker-discriminative. A cross-channel data set is sorted out from the NIST speaker recognition evaluation, and further experiments are conducted on the AISHELL-Wake-Up-1 data set. Results show that DAS performs as effectively as typical domain adversarial methods, improving results by at least 10% in equal error rate (EER). Furthermore, it proves more effective in scenarios such as unbalanced data amounts and unknown domains, achieving a relative improvement of 11%.
Subjects: NATIONAL Institute of Standards & Technology (U.S.); ARTIFICIAL neural networks; ENERGY consumption; GAUSSIAN processes; SPEECH synthesis
Publication: Electronics Letters (Wiley-Blackwell), 2020, Vol 56, Issue 14, p737
ISSN: 0013-5194
Publication type: Article
DOI: 10.1049/el.2020.0673
