- Title
A review of machine transliteration, translation, evaluation metrics and datasets in Indian Languages.
- Authors
Jha, Abhinav; Patil, Hemprasad Yashwant
- Abstract
In today's global scenario, frequent international and domestic interactions necessitate the application of Machine Transliteration and Translation systems to overcome the language barrier. This paper reviews Natural Language Processing (NLP) techniques such as Machine Translation (MT) and Machine Transliteration (MTn), provides an analytical study of evaluation metrics such as the BLEU (BiLingual Evaluation Understudy) score, and discusses the datasets available for MT and MTn systems in Indian languages. The paper is unique in providing a detailed review of all steps in the NLP system development pipeline, from the creation and collection of data, through the development of the system, to its evaluation and analysis. It also comments on the validity and viability of various evaluation metrics for Indian languages. MT and MTn form an evolving field of computational linguistics, and such systems are considered incredibly challenging to develop. For Indian languages the task is even more complicated: grammatical rules are not readily available, proper and common nouns are not clearly distinguished, large datasets are lacking, and the languages carry additional linguistic complexity compared to many others. The paper explores different approaches employed in MT tasks, including statistics-oriented, example-oriented, and neural-network-oriented techniques, and provides insight into the work carried out so far for Indian languages. The review also discusses the scope for future research in this field. This article determines the current status of available datasets and of MT and MTn systems, and comments on the validity of currently available evaluation metrics such as BLEU for Indian languages. The article also suggests a direction in which further research for Indian languages should ideally be headed.
- Subjects
NATURAL language processing; MACHINE translating; TRANSLITERATION; LINGUISTIC complexity; COMPUTATIONAL linguistics
- Publication
Multimedia Tools & Applications, 2023, Vol 82, Issue 15, p23509
- ISSN
1380-7501
- Publication type
Article
- DOI
10.1007/s11042-022-14273-1