- Title
The Czech National Corpus: Principles, Design, and Results.
- Authors
Kučera, Karel
- Abstract
This paper describes the general principles, design, and present state of the Czech National Corpus (CNC) project. The corpus has been designed to provide a firm basis for the study of both contemporary written Czech (a goal readily attainable with the present resources) and the Czech language beyond the limits of contemporary written texts (a long‐term commitment that includes building a corpus of spoken Czech as well as diachronic and dialectal corpora). The work on the CNC project, now in the eighth year of its official existence, has resulted in the completion of SYN2000, a 100‐million‐word corpus of contemporary written Czech, the organization of the cores of spoken, diachronic, and dialectal corpora, and the finding of workable solutions to some general theoretical problems involved in the building of these corpora.
- Subjects
CZECH Republic; HISTORICAL linguistics; CORPORA; LINGUISTIC analysis; CZECH language; WESTERN Slavic languages; ARCHAISMS (Linguistics); DIALECTS; DIALECT literature
- Publication
Literary & Linguistic Computing, 2002, Vol 17, Issue 2, p245
- ISSN
0268-1145
- Publication type
Article
- DOI
10.1093/llc/17.2.245