- Title
Let's get into it: Using contextualized embeddings as retrieval tools.
- Authors
Fonteyn, Lauren
- Abstract
This squib briefly explores how contextualized embeddings, a type of compressed, token-based semantic vector, can be used as semantic retrieval and annotation tools for corpus-based research into constructions. Focusing on embeddings produced by the Bidirectional Encoder Representations from Transformers ('BERT') model, it demonstrates how contextualized embeddings can help counter two retrieval-inefficiency scenarios that may arise with purely form-based corpus queries. In the first scenario, the formal query yields a large number of hits containing a reasonable proportion of relevant examples, which can be labeled and used as input for a sense-disambiguation classifier. In the second scenario, the contextualized embeddings of exemplary tokens are used to retrieve more relevant examples from a large, unlabeled dataset. As a case study, the squib focuses on the into-interest construction (e.g. I'm so into you).
- Subjects
CORPORA; INFORMATION retrieval
- Publication
Belgian Journal of Linguistics, 2020, Vol. 34, Issue 1, p. 66
- ISSN
0774-5141
- Publication type
Article
- DOI
10.1075/bjl.00035.fon
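
The second retrieval scenario described in the abstract, using the embeddings of exemplary tokens to find similar examples in an unlabeled dataset, can be illustrated with a minimal cosine-similarity sketch. The toy 3-dimensional vectors below stand in for real contextualized embeddings (e.g. the 768-dimensional token vectors a BERT model would produce); the function name and the data are hypothetical, not taken from the article.

```python
import numpy as np

def retrieve_similar(exemplar_vecs, candidate_vecs, top_k=2):
    """Rank unlabeled candidates by cosine similarity to the
    centroid of a handful of labeled exemplar embeddings."""
    # Average the exemplar embeddings into a single query vector.
    centroid = np.mean(exemplar_vecs, axis=0)
    centroid = centroid / np.linalg.norm(centroid)
    # Normalize candidates row-wise; cosine similarity is then a dot product.
    norms = np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    sims = (candidate_vecs / norms) @ centroid
    # Indices of the top_k most similar candidates, best first.
    order = np.argsort(-sims)
    return order[:top_k], sims[order[:top_k]]

# Toy vectors standing in for contextualized embeddings of the token
# 'into' in a few sentences: two labeled exemplars of the target sense,
# three unlabeled corpus hits.
exemplars = np.array([[1.0, 0.1, 0.0],
                      [0.9, 0.0, 0.1]])
candidates = np.array([[1.0, 0.0, 0.05],   # close to the exemplars
                       [0.0, 1.0, 0.0],    # unrelated sense
                       [0.8, 0.2, 0.0]])   # fairly close
idx, sims = retrieve_similar(exemplars, candidates, top_k=2)
```

In practice the candidate matrix would hold one contextualized vector per corpus hit, and the top-ranked hits would be passed to a human annotator, which is the kind of embedding-assisted retrieval loop the squib describes.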