丁凯旋; 陈雁翔; 赵鹏铖; 朱玉鹏; 盛振涛

doi:10.3778/j.issn.1002-8331.2102-0060

Title: Cross-Modal Representation Learning Under a Multi-Negatives Contrastive Mechanism (多负例对比机制下的跨模态表示学习).
Authors: 丁凯旋; 陈雁翔; 赵鹏铖; 朱玉鹏; 盛振涛
Abstract: To obtain more distinctive cross-modal representations effectively, a cross-modal representation learning method based on the multi-negatives contrastive mechanism, supervised contrastive cross-modal representation learning (SCCMRL), is proposed and applied to the visual and audio modalities. SCCMRL extracts visual and audio features with a vision encoder and an audio encoder, and uses a supervised contrastive loss to compare each sample with its multiple negatives. As a result, audio-visual features that belong to the same category are pulled closer together, and audio-visual features that belong to different categories are pushed further apart. The method also introduces a center loss and a label loss to ensure modality consistency and semantic discrimination between the cross-modal representations. To verify the effectiveness of SCCMRL, the paper constructs a corresponding cross-modal retrieval system and conducts cross-modal retrieval experiments on the Sub_URMP and XmediaNet datasets. The experimental results show that SCCMRL achieves a higher mAP than the commonly used cross-modal retrieval methods it is compared against. The results also verify the feasibility of applying the multi-negatives contrastive mechanism to cross-modal representation learning.
Subjects: AUTOMATED storage retrieval systems; VISION
Publication: Journal of Computer Engineering & Applications, 2022, Vol 58, Issue 19, p184
ISSN: 1002-8331
Publication type: Article
DOI: 10.3778/j.issn.1002-8331.2102-0060
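
The multi-negatives contrastive mechanism described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the widely used supervised contrastive (SupCon-style) form, in which each visual embedding is the anchor, audio embeddings with the same label are positives, and all other audio embeddings in the batch are negatives. The function names, the temperature value, and the toy embeddings are hypothetical, and the center loss and label loss mentioned in the abstract are not shown.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cross_modal_supcon_loss(visual, audio, visual_labels, audio_labels,
                            temperature=0.1):
    """Assumed SupCon-style loss with visual anchors and audio candidates.

    For each visual anchor, every audio embedding in the batch appears in the
    softmax denominator, so all audio samples from other classes act as
    negatives at once (the "multi-negatives" idea).
    """
    visual = [l2_normalize(v) for v in visual]
    audio = [l2_normalize(a) for a in audio]

    total, counted = 0.0, 0
    for v, v_label in zip(visual, visual_labels):
        # Temperature-scaled cosine similarity to every audio embedding.
        logits = [sum(x * y for x, y in zip(v, a)) / temperature for a in audio]

        # Numerically stable log of the softmax denominator.
        max_logit = max(logits)
        log_denom = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits))

        # Positives: audio samples that share the anchor's class label.
        positives = [l for l, a_label in zip(logits, audio_labels)
                     if a_label == v_label]
        if not positives:
            continue  # anchor has no positive in this batch; skip it

        # Average negative log-probability over the anchor's positives.
        total += -sum(l - log_denom for l in positives) / len(positives)
        counted += 1

    return total / counted if counted else 0.0


# Toy batch: two classes, 2-D embeddings (hypothetical values).
visual_emb = [[1.0, 0.0], [0.0, 1.0]]
audio_emb = [[1.0, 0.0], [0.0, 1.0]]

aligned = cross_modal_supcon_loss(visual_emb, audio_emb, [0, 1], [0, 1])
misaligned = cross_modal_supcon_loss(visual_emb, audio_emb, [0, 1], [1, 0])
```

In this toy batch the loss is smaller when same-class audio and visual embeddings point in the same direction (`aligned`) than when the audio labels are swapped (`misaligned`), which matches the behavior the abstract describes: same-category features move closer and different-category features move apart.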
