- Title
Estimating the Amount of Lithuanian Text Indexed by Global Search Engines.
- Authors
DADURKEVICIUS, Virginijus; UTKA, Andrius
- Abstract
The aim of the paper is to estimate the number of words in Lithuanian texts indexed by selected global search engines (GSEs), namely Google (by Alphabet Inc.), Bing (by Microsoft Corporation), and Yandex (by ООО «Яндекс», Russia). For this purpose, a special list of 100 rare Lithuanian words (pivot words) with specific characteristics was compiled. The low frequency of the pivot words is crucial: it allows the count of document matches reported by a GSE to be treated as an indicator of the word count. Statistical analysis yielded the following numbers of indexed Lithuanian words as of April 2022: 56 billion for Google, 29 billion for Bing, and 41 billion for Yandex. Comparative results have also been assessed for the neighbouring Belarusian (~0.31xLT), Estonian (~1.45xLT), Finnish (~2.4xLT), Latvian (~0.95xLT), Polish (~11xLT), and Russian (~49xLT) languages.
- Subjects
SEARCH engines; GOOGLE Inc.; LITHUANIANS; WORD frequency; LITHUANIAN language
- Publication
Baltic Journal of Modern Computing, 2022, Vol 10, Issue 3, p326
- ISSN
2255-8942
- Publication type
Article
- DOI
10.22364/bjmc.2022.10.3.06
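
The pivot-word idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not specify the statistical analysis, so the scaling by a reference-corpus frequency and the use of the median are assumptions, and the function name `estimate_indexed_words` and both input mappings are hypothetical. Each pivot word's reported match count is divided by how often the word is expected to occur per billion words, and the per-word estimates are combined with a median to limit the effect of outliers.

```python
from statistics import median


def estimate_indexed_words(hits, per_billion):
    """Rough estimate of the number of words indexed by a search engine.

    hits        -- pivot word -> document matches reported by the engine
    per_billion -- pivot word -> expected occurrences per billion words,
                   taken from some reference corpus (assumed to be available)

    Assumption: a pivot word is rare enough that it seldom occurs more than
    once per document, so the match count approximates the occurrence count.
    """
    estimates = []
    for word, matches in hits.items():
        rate = per_billion.get(word)
        if not rate or rate <= 0:
            continue  # skip words with no usable reference frequency
        # matches / (rate / 1e9) gives the implied total word count
        estimates.append(matches * 10**9 / rate)
    if not estimates:
        raise ValueError("no pivot word has a usable reference frequency")
    # The median is an assumed aggregation choice, robust to a few bad words
    return median(estimates)
```

Both mappings would be keyed by the same pivot words; words missing from the reference frequencies are skipped rather than guessed.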