- Title
Indonesian graphemic syllabification using a nearest neighbour classifier and recovery procedure.
- Authors
Parande, Edwina Anky; Suyanto, Suyanto
- Abstract
Automatic syllabification, which decomposes a word into syllables, is an important part of an automatic speech recognition (ASR) system that uses both syllable-based acoustic and language models. It can be applied to either phoneme or grapheme sequences. Phonemic syllabification is the more complex of the two, since it requires grapheme-to-phoneme conversion (G2P) as a preceding step. It generally gives high accuracy for many formal words, but its accuracy may decrease for person names. In contrast, graphemic syllabification is simpler and has more potential for person names. This research develops a graphemic syllabification model that combines phonotactic rules with Fuzzy k-nearest neighbour in every Class (FkNNC). The phonotactic rules are designed to find some deterministic syllabification points, while FkNNC, as a statistical classifier, is expected to find the remaining stochastic syllabification points. A recovery procedure is proposed to correct wrong syllabification points produced by FkNNC. Fivefold cross-validation on a dataset of 50k formal words, selected from the great dictionary of the Indonesian language, shows that the proposed model gives a syllable error rate (SER) of 2.48% and that the proposed recovery procedure reduces the SER to 2.27%, which is higher than that of phonemic syllabification (only 0.99%). However, the model can handle a dataset of 15k high-variance person names with an SER of 7.45%, which the proposed recovery procedure reduces to 6.78%.
- Subjects
SYLLABICATION; AUTOMATIC speech recognition; PHONOTACTICS; INDONESIAN language; GRAPHEMICS; ERROR rates
- Publication
International Journal of Speech Technology, 2019, Vol 22, Issue 1, p13
- ISSN
1381-2416
- Publication type
Article
- DOI
10.1007/s10772-018-09569-3
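
- Illustrative sketch
The abstract describes a two-stage idea: phonotactic rules fix the syllabification points that are deterministic, and a statistical classifier handles the rest, with quality reported as a syllable error rate (SER). The Python sketch below illustrates only the rule stage and a scoring function. Everything in it is an assumption made for illustration and is not taken from the paper: the single toy rule (split between two consonants flanked by vowels unless they form a digraph), the digraph list, the function names, and the SER definition (share of reference syllables whose exact span is missing from the prediction). FkNNC and the recovery procedure are not implemented here.

```python
VOWELS = set("aiueo")
# Assumed list of Indonesian consonant digraphs; hypothetical, not from the paper.
DIGRAPHS = {"ng", "ny", "kh", "sy"}


def deterministic_boundaries(word):
    """Return indices i such that a syllable boundary falls before word[i].

    Toy rule (an assumption): in a V C C V pattern, split between the two
    consonants unless they form a digraph.
    """
    w = word.lower()
    points = []
    for i in range(2, len(w) - 1):
        left, right = w[i - 1], w[i]
        if (
            w[i - 2] in VOWELS
            and left not in VOWELS
            and right not in VOWELS
            and w[i + 1] in VOWELS
            and left + right not in DIGRAPHS
        ):
            points.append(i)
    return points


def split_at(word, points):
    """Cut a word into pieces at the given boundary indices."""
    pieces, start = [], 0
    for p in sorted(points):
        pieces.append(word[start:p])
        start = p
    pieces.append(word[start:])
    return pieces


def _spans(syllables):
    """Convert a syllable list into a set of (start, end) character spans."""
    spans, pos = set(), 0
    for s in syllables:
        spans.add((pos, pos + len(s)))
        pos += len(s)
    return spans


def syllable_error_rate(predicted, reference):
    """Assumed SER: fraction of reference syllables absent from the prediction."""
    errors = total = 0
    for pred, ref in zip(predicted, reference):
        pred_spans = _spans(pred)
        for span in _spans(ref):
            total += 1
            if span not in pred_spans:
                errors += 1
    return errors / total if total else 0.0


if __name__ == "__main__":
    for word in ["tanpa", "bunga", "makan"]:
        print(word, split_at(word, deterministic_boundaries(word)))
```

Under this toy rule, "tanpa" is split between "n" and "p", while "bunga" and "makan" are left unsplit. Those unresolved boundaries are the kind of point that, in the abstract's description, would be left to the statistical classifier.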