- Title
Cross-modal representation learning under a multi-negatives contrastive mechanism.
- Authors
丁凯旋; 陈雁翔; 赵鹏铖; 朱玉鹏; 盛振涛
- Abstract
To obtain more discriminative cross-modal representations, this paper proposes supervised contrastive cross-modal representation learning (SCCMRL), a method based on a multi-negatives contrastive mechanism, and applies it to the visual and audio modalities. SCCMRL extracts visual and audio features with a vision encoder and an audio encoder, and uses a supervised contrastive loss to compare each sample against its multiple negatives. As a result, audio-visual features of the same category are pulled closer together, while those of different categories are pushed farther apart. The method also introduces a center loss and a label loss to ensure modality consistency and semantic discrimination across the cross-modal representations. To verify the effectiveness of SCCMRL, the paper builds a corresponding cross-modal retrieval system and runs cross-modal retrieval experiments on the Sub_URMP and XmediaNet datasets. The experimental results show that SCCMRL achieves a higher mAP than commonly used cross-modal retrieval methods, which also supports the feasibility of applying the multi-negatives contrastive mechanism to cross-modal representation learning.
- Subjects
AUTOMATED storage retrieval systems; VISION
- Publication
Journal of Computer Engineering & Applications, 2022, Vol 58, Issue 19, p184
- ISSN
1002-8331
- Publication type
Article
- DOI
10.3778/j.issn.1002-8331.2102-0060
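- Illustrative sketch
The abstract describes a supervised contrastive loss that compares each sample with multiple negatives across the visual and audio modalities, but it does not give the exact formulation. The following is a minimal sketch of a generic supervised contrastive loss in that spirit, not the paper's implementation. The function name, the `temperature` parameter and its default, and the choice of visual features as anchors are all assumptions made for illustration; the center loss and label loss are omitted. Feature vectors are assumed to be non-zero.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length (assumes it is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def supervised_contrastive_loss(visual, audio, labels, temperature=0.1):
    """Illustrative cross-modal supervised contrastive loss (hypothetical helper).

    visual, audio: lists of N feature vectors of equal dimension.
    labels: list of N class ids shared by visual[i] and audio[i].
    Each visual feature is an anchor; audio features with the same label are
    its positives, and every other audio feature is a negative.
    """
    v = [_normalize(x) for x in visual]
    a = [_normalize(x) for x in audio]
    total = 0.0
    for i, anchor in enumerate(v):
        # Temperature-scaled cosine similarity to every audio feature.
        logits = [
            sum(p * q for p, q in zip(anchor, other)) / temperature
            for other in a
        ]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        positives = [j for j, y in enumerate(labels) if y == labels[i]]
        # Average negative log-probability over this anchor's positives.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
    return total / len(v)


# Toy data (made up): two classes, two samples each.
labels = [0, 0, 1, 1]
visual = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
aligned_audio = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
swapped_audio = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]

print(supervised_contrastive_loss(visual, aligned_audio, labels))
print(supervised_contrastive_loss(visual, swapped_audio, labels))
```

In this toy setup the loss is lower when same-category audio and visual features point in the same direction than when the categories are swapped, which matches the behavior the abstract attributes to the contrastive term.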