- Title
An effective negative sampling approach for contrastive learning of sentence embedding.
- Authors
Tan, Qitao; Song, Xiaoying; Ye, Guanghui; Wu, Chuan
- Abstract
Unsupervised sentence embedding learning is a fundamental task in natural language processing. Recently, unsupervised contrastive learning based on pre-trained language models has shown impressive performance in sentence embedding learning. This method aims to align positive sentence pairs while pushing apart negative sentence pairs to achieve semantic uniformity in the representation space. However, most previous work samples negative pairs with a random strategy, which risks selecting uninformative negative examples (e.g., easily distinguishable examples, anisotropic representations) and thus degrades the quality of the learned representations. To address this issue, we propose nmCSE, a negative mining contrastive learning method for sentence embedding. Specifically, we introduce distance moderation and spatial uniformity as two properties of informative negative examples, and devise distance-based weighting and grid sampling as two strategies to preserve these properties, respectively. Our proposal outperforms the random strategy across seven semantic textual similarity datasets. Furthermore, our method can easily be adapted to other contrastive learning scenarios (e.g., vision), and does not introduce significant computational overhead.
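The general idea of distance-based negative weighting can be sketched as follows. This is an illustrative NumPy example of weighting candidate negatives by "distance moderation" (favoring negatives that are neither trivially far from nor nearly identical to the anchor), not the authors' exact nmCSE formulation; the function name, the similarity center of 0.5, and the bandwidth parameter are assumptions for illustration.

```python
import numpy as np

def distance_weighted_negatives(anchor, candidates, center=0.5, bandwidth=0.05):
    """Assign sampling weights to candidate negative embeddings.

    Illustrative sketch only: negatives with moderate cosine similarity
    to the anchor get high weight; very easy negatives (similarity near -1)
    and near-duplicates (similarity near 1) are down-weighted.
    """
    # Normalize so the dot product is cosine similarity.
    a = anchor / np.linalg.norm(anchor)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    sims = c @ a
    # Gaussian bump peaked at a moderate similarity value (assumed 0.5).
    weights = np.exp(-((sims - center) ** 2) / bandwidth)
    # Normalize to a sampling distribution over the candidates.
    return weights / weights.sum()
```

In a contrastive objective, these weights could then be used to sample (or reweight) in-batch negatives instead of treating all of them uniformly, which is the random strategy the abstract argues against.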
- Subjects
LANGUAGE models; NATURAL language processing
- Publication
Machine Learning, 2023, Vol 112, Issue 12, p4837
- ISSN
0885-6125
- Publication type
Article
- DOI
10.1007/s10994-023-06408-8