- Title
Nearest Neighbour-Based Indonesian G2P Conversion.
- Authors
Suyanto; Agus Harjoko
- Abstract
Grapheme-to-phoneme conversion (G2P), also known as letter-to-sound conversion, is an important module in both speech synthesis and speech recognition. G2P methods yield varying accuracy across languages even though they are designed to be language independent. This paper presents a new model based on the pseudo nearest neighbour rule (PNNR) for Indonesian G2P. The model introduces a partial orthogonal binary code for graphemes, contextual weighting, and neighbourhood weighting. Testing on 9,604 unseen words shows that the model parameters are easy to tune to reach high accuracy. Testing on 123 sentences containing homographs shows that the model can disambiguate homographs if it uses a long graphemic context. Compared to an information gain tree, PNNR gives a slightly higher phoneme error rate, but it can disambiguate homographs.
- Subjects
GRAPHEMICS; PHONEMICS; NEAREST neighbor analysis (Statistics); INDONESIAN language; HOMONYMS; BINARY codes; ALPHABETIC principle (Reading)
- Publication
Telkomnika, 2014, Vol 12, Issue 2, p389
- ISSN
1693-6930
- Publication type
Article
- DOI
10.12928/TELKOMNIKA.v12i2.1945
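
The abstract names the pseudo nearest neighbour rule (PNNR) without describing it. As a rough illustration only, the sketch below shows the generic rule as it is commonly formulated: for each class, take the k nearest training samples, sum their distances with weights 1/rank, and pick the class with the smallest sum. It is not the authors' model. The partial orthogonal binary code, contextual weighting, and neighbourhood weighting mentioned in the abstract are not specified in this record and are left out. The function names, the Hamming distance, the 1/rank weights, and the toy binary vectors and labels are all assumptions made for the example.

```python
from collections import defaultdict


def hamming(a, b):
    """Count the positions where two equal-length binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def pnnr_classify(query, samples, k=3, distance=hamming):
    """Classify `query` with a generic pseudo nearest neighbour rule.

    `samples` is a list of (feature_vector, label) pairs. The sketch assumes
    every class has at least k samples; a smaller class would sum fewer
    terms and get an unfairly low score.
    """
    # Group the distances from the query by class label.
    distances_by_label = defaultdict(list)
    for features, label in samples:
        distances_by_label[label].append(distance(query, features))

    best_label, best_score = None, float("inf")
    for label, dists in distances_by_label.items():
        nearest = sorted(dists)[:k]
        # Weight the i-th nearest neighbour by 1/i (assumed weighting).
        score = sum(d / rank for rank, d in enumerate(nearest, start=1))
        if score < best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: binary context codes labelled with a phoneme name.
toy_samples = [
    ((0, 0, 1, 1), "schwa"),
    ((0, 1, 1, 1), "schwa"),
    ((1, 1, 0, 0), "e"),
    ((1, 0, 0, 0), "e"),
]

predicted = pnnr_classify((0, 0, 1, 0), toy_samples, k=2)
print(predicted)
```

In this toy case the "schwa" class scores 1 + 2/2 = 2 and the "e" class scores 2 + 3/2 = 3.5, so the query is labelled "schwa".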