- Title
Estonian Text-to-Speech Synthesis with Non-autoregressive Transformers.
- Authors
RÄTSEP, Liisa; LELLEP, Rasmus; FISHEL, Mark
- Abstract
While text-to-speech synthesis with non-autoregressive Transformers has achieved state-of-the-art quality for many languages, the methodology of Estonian text-to-speech synthesis has not been revised for neural methods. This paper evaluates the quality of Estonian text-to-speech with Transformer-based models using different language-specific data processing steps. Additionally, we conduct a human evaluation to show how well these models can learn the patterns of Estonian pronunciation, given different amounts of training data and varying degrees of phonetic information. Our error analysis shows that using a simple multi-speaker approach can significantly decrease the number of pronunciation errors, while some information can also be helpful to a smaller extent.
- Subjects
SPEECH synthesis; PRONUNCIATION
- Publication
Baltic Journal of Modern Computing, 2022, Vol 10, Issue 3, p447
- ISSN
2255-8942
- Publication type
Article
- DOI
10.22364/bjmc.2022.10.3.17