- Title
Let's get into it: Using contextualized embeddings as retrieval tools.
- Authors
Fonteyn, Lauren
- Abstract
This squib briefly explores how contextualized embeddings, a type of compressed, token-based semantic vector, can be used as semantic retrieval and annotation tools for corpus-based research into constructions. Focusing on embeddings produced by the Bidirectional Encoder Representations from Transformers ('BERT') model, it demonstrates how contextualized embeddings can help counter two retrieval-inefficiency scenarios that may arise with purely form-based corpus queries. In the first scenario, the formal query yields a large number of hits containing a reasonable proportion of relevant examples, which can be labeled and used as input for a sense-disambiguation classifier. In the second scenario, the contextualized embeddings of exemplary tokens are used to retrieve more relevant examples from a large, unlabeled dataset. As a case study, the squib focuses on the into-interest construction (e.g. I'm so into you).
- Subjects
CORPORA; INFORMATION retrieval
- Publication
Belgian Journal of Linguistics, 2020, Vol. 34, Issue 1, p. 66
- ISSN
0774-5141
- Publication type
Article
- DOI
10.1075/bjl.00035.fon
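
The second retrieval scenario described in the abstract, using the embeddings of exemplary tokens to find similar examples in an unlabeled dataset, can be illustrated with a minimal cosine-similarity sketch. The toy 3-dimensional vectors below stand in for real contextualized embeddings (e.g. the 768-dimensional token vectors a BERT model would produce); the function name and the data are hypothetical, not taken from the article.

```python
import numpy as np

def retrieve_similar(exemplar_vecs, candidate_vecs, top_k=2):
    """Rank unlabeled candidates by cosine similarity to the
    centroid of a handful of labeled exemplar embeddings."""
    # Average the exemplar embeddings into a single query vector.
    centroid = np.mean(exemplar_vecs, axis=0)
    centroid = centroid / np.linalg.norm(centroid)
    # Normalize candidates row-wise; cosine similarity is then a dot product.
    norms = np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    sims = (candidate_vecs / norms) @ centroid
    # Indices of the top_k most similar candidates, best first.
    order = np.argsort(-sims)
    return order[:top_k], sims[order[:top_k]]

# Toy vectors standing in for contextualized embeddings of the token
# 'into' in a few sentences: two labeled exemplars of the target sense,
# three unlabeled corpus hits.
exemplars = np.array([[1.0, 0.1, 0.0],
                      [0.9, 0.0, 0.1]])
candidates = np.array([[1.0, 0.0, 0.05],   # close to the exemplars
                       [0.0, 1.0, 0.0],    # unrelated sense
                       [0.8, 0.2, 0.0]])   # fairly close
idx, sims = retrieve_similar(exemplars, candidates, top_k=2)
```

In practice the candidate matrix would hold one contextualized vector per corpus hit, and the top-ranked hits would be passed to a human annotator, which is the kind of embedding-assisted retrieval loop the squib describes.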