- Title
An effective negative sampling approach for contrastive learning of sentence embedding.
- Authors
Tan, Qitao; Song, Xiaoying; Ye, Guanghui; Wu, Chuan
- Abstract
Unsupervised sentence embedding learning is a fundamental task in natural language processing. Recently, unsupervised contrastive learning based on pre-trained language models has shown impressive performance in sentence embedding learning. This method aims to align positive sentence pairs while pushing apart negative sentence pairs to achieve semantic uniformity in the representation space. However, most previous work samples negative pairs with a random strategy, which risks selecting uninformative negative examples (e.g., easily distinguishable examples, anisotropic representations) and thus degrades the quality of the learned representations. To address this issue, we propose nmCSE, a negative mining contrastive learning method for sentence embedding. Specifically, we introduce distance moderation and spatial uniformity as two properties of informative negative examples, and devise distance-based weighting and grid sampling as two strategies to preserve these properties, respectively. Our proposal outperforms the random strategy across seven semantic textual similarity datasets. Furthermore, our method can easily be adapted to other contrastive learning scenarios (e.g., vision), and does not introduce significant computational overhead.
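The general idea of distance-based negative weighting can be sketched as follows. This is an illustrative NumPy example of weighting candidate negatives by "distance moderation" (favoring negatives that are neither trivially far from nor nearly identical to the anchor), not the authors' exact nmCSE formulation; the function name, the similarity center of 0.5, and the bandwidth parameter are assumptions for illustration.

```python
import numpy as np

def distance_weighted_negatives(anchor, candidates, center=0.5, bandwidth=0.05):
    """Assign sampling weights to candidate negative embeddings.

    Illustrative sketch only: negatives with moderate cosine similarity
    to the anchor get high weight; very easy negatives (similarity near -1)
    and near-duplicates (similarity near 1) are down-weighted.
    """
    # Normalize so the dot product is cosine similarity.
    a = anchor / np.linalg.norm(anchor)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    sims = c @ a
    # Gaussian bump peaked at a moderate similarity value (assumed 0.5).
    weights = np.exp(-((sims - center) ** 2) / bandwidth)
    # Normalize to a sampling distribution over the candidates.
    return weights / weights.sum()
```

In a contrastive objective, these weights could then be used to sample (or reweight) in-batch negatives instead of treating all of them uniformly, which is the random strategy the abstract argues against.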
- Subjects
LANGUAGE models; NATURAL language processing
- Publication
Machine Learning, 2023, Vol 112, Issue 12, p4837
- ISSN
0885-6125
- Publication type
Article
- DOI
10.1007/s10994-023-06408-8