Title: Estonian Text-to-Speech Synthesis with Non-autoregressive Transformers.
Authors: RÄTSEP, Liisa; LELLEP, Rasmus; FISHEL, Mark
Abstract: While text-to-speech synthesis with non-autoregressive Transformers has achieved state-of-the-art quality for many languages, the methodology of Estonian text-to-speech synthesis has not been revised for neural methods. This paper evaluates the quality of Estonian text-to-speech with Transformer-based models using different language-specific data processing steps. Additionally, we conduct a human evaluation to show how well these models can learn the patterns of Estonian pronunciation, given different amounts of training data and varying degrees of phonetic information. Our error analysis shows that using a simple multi-speaker approach can significantly decrease the number of pronunciation errors, while some information can also be helpful to a smaller extent.
Subjects: SPEECH synthesis; PRONUNCIATION
Publication: Baltic Journal of Modern Computing, 2022, Vol 10, Issue 3, p447
ISSN: 2255-8942
Publication type: Article
DOI: 10.22364/bjmc.2022.10.3.17
