- Title
Exploiting languages proximity for part-of-speech tagging of three French regional languages.
- Authors
Magistry, Pierre; Ligozat, Anne-Laure; Rosset, Sophie
- Abstract
This paper presents experiments in part-of-speech tagging of low-resource languages. It addresses the case when no labeled data in the targeted language and no parallel corpus are available. We only rely on the proximity of the targeted language to a better-resourced language. We conduct experiments on three French regional languages. We try to exploit this proximity with two main strategies: delexicalization and transposition. The general idea is to learn a model on the (better-resourced) source language, which will then be applied to the (regional) target language. Delexicalization is used to deal with the difference in vocabulary, by creating abstract representations of the data. Transposition consists in modifying the target corpus to be able to use the source models. We compare several methods and propose different strategies to combine them and improve the state-of-the-art of part-of-speech tagging in this difficult scenario.
- Subjects
PICARDY (France); ALSACE (France); FRANCE; FRENCH language; PARTS of speech; TAGS (Metadata); FRENCH dialects; OCCITAN language; LANGUAGE & languages
- Publication
Language Resources & Evaluation, 2019, Vol 53, Issue 4, p865
- ISSN
1574-020X
- Publication type
Article
- DOI
10.1007/s10579-019-09463-7