- Title
A Corpus-based Machine Translation Method of Term Extraction in LSP Texts.
- Authors
Wei Huangfu; Yushan Zhao
- Abstract
To tackle the problems of term extraction in specialized fields, this paper proposes a method that coordinates the use of a corpus and a machine translation system to extract terms from language for specific purposes (LSP) texts. A comparable corpus built for this research contains 167 English texts and 229 Chinese texts, with around 600,000 English tokens and 900,000 Chinese characters. The corpus is annotated with meta-information and POS-tagged for further use. To obtain the keyword list from the corpus, BFSU PowerConc software is used with the reference corpora Crown and CLOB for English and TORCH and LCMC for Chinese. A VB program is written to generate the multi-word units; Google Translator Toolkit is then used to obtain translation pairs, and the SDL Trados fuzzy match function is applied to extract lists of multi-word terms and their translations. The results show that 70% of the translated term pairs score 2.0 on a 0~3 grading scale with a 0.5 interval, as rated by human graders. The method can be applied to extract translation term pairs for computer-aided translation of LSP texts. In addition, the by-product comparable corpus, combined with N-gram multi-word unit lists, can be used to assist trainee translators. The findings underline the significance of combining machine translation with corpus techniques, and point to the need for building and sharing comparable corpora and for Conc-gram extraction in this field.
- Subjects
TRANSLATIONS; CORPORA; CHINESE language; COMPARATIVE studies; COMPUTER software; EQUIPMENT & supplies
- Publication
Theory & Practice in Language Studies (TPLS), 2014, Vol 4, Issue 1, p46
- ISSN
1799-2591
- Publication type
Article
- DOI
10.4304/tpls.4.1.46-51
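- Illustrative sketch
The abstract mentions a VB program that generates multi-word units from the corpus but gives no details of it. The Python sketch below is only one plausible reading of that step, not the authors' implementation: it counts contiguous N-grams over a token list and keeps those that recur. The function name `multiword_units` and the parameters `n_min`, `n_max`, and `min_freq` are hypothetical, chosen here for illustration.

```python
from collections import Counter


def multiword_units(tokens, n_min=2, n_max=4, min_freq=2):
    """Return contiguous N-grams (as strings) that occur at least min_freq times.

    Assumption: the N-gram length range and the frequency cutoff are
    illustrative defaults; the abstract does not state the values used.
    """
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of width n across the token sequence.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return {" ".join(gram): c for gram, c in counts.items() if c >= min_freq}


# Toy input; a real run would use the POS-tagged corpus tokens instead.
sample = "machine translation system and machine translation output".split()
units = multiword_units(sample)
print(units)
```

In a pipeline like the one described, candidate units of this kind would then be filtered against the keyword list and passed on for translation and fuzzy matching.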