Title: Recovering Word Forms by Context for Morphologically Rich Languages.
Authors: Alekseev, A. M.; Nikolenko, S. I.
Abstract: In this work, we focus on "sentence-level unlemmatization," the task of generating a grammatical sentence given a lemmatized one; this task is usually easy for humans but may present problems for machine learning models. We treat this setting as a machine translation problem and, as a first attempt, apply a sequence-to-sequence model to the texts of Russian Wikipedia articles, quantitatively evaluate the effect of different training set sizes, and achieve a BLEU score of 67.3 using the largest training set available. We discuss preliminary results and flaws of traditional machine translation evaluation methods for this task and suggest directions for future research.
Subjects: WIKIPEDIA; MACHINE learning; MACHINE translating; TASK analysis
Publication: Journal of Mathematical Sciences, 2023, Vol 273, Issue 4, p527
ISSN: 1072-3374
Publication type: Article
DOI: 10.1007/s10958-023-06518-7
