Title: Current advances and algorithmic solutions in speech generation.
Authors: Oralbekova, Dina; Mamyrbayev, Orken; Kassymova, Dinara; Othman, Mohamed
Abstract: Text-to-Speech (TTS) technology, which aims to reproduce a natural human voice from text, is in growing demand in natural language processing. The key criteria for evaluating synthesized speech are clarity and naturalness, which depend largely on how accurately the acoustic model of the speech generation system captures intonation. This paper presents fundamental methods for building the acoustic model: concatenative and parametric speech synthesis, synthesis based on hidden Markov models, and deep learning approaches such as end-to-end models. It also discusses metrics for evaluating the quality of synthesized speech and briefly reviews modern deep learning text-to-speech architectures, such as WaveNet, Tacotron, and Deep Voice, whose quality ratings approach those of professionally recorded speech.
Subjects: DEEP learning; NATURAL language processing; SPEECH; SPEECH synthesis; HIDDEN Markov models; ACOUSTIC models
Publication: Vibroengineering Procedia, 2024, Vol 54, Issue 1, p160
ISSN: 2345-0533
Publication type: Article
DOI: 10.21595/vp.2024.23940
