- Title
Learnable Depth-Sensitive Attention for Deep RGB-D Saliency Detection with Multi-modal Fusion Architecture Search.
- Authors
Sun, Peng; Zhang, Wenhu; Li, Songyuan; Guo, Yilin; Song, Congli; Li, Xi
- Abstract
RGB-D salient object detection (SOD) is usually formulated as a problem of classification or regression over two modalities, i.e., RGB and depth. Hence, effective RGB-D feature modeling and multi-modal feature fusion both play a vital role in RGB-D SOD. In this paper, we propose a depth-sensitive RGB feature modeling scheme using the depth-wise geometric prior of salient objects. In principle, the feature modeling scheme is carried out in a Depth-Sensitive Attention Module (DSAM), which leads to RGB feature enhancement as well as background distraction reduction by capturing the depth geometry prior. Furthermore, we extend and enhance the original DSAM to DSAMv2 by proposing a novel Depth Attention Generation Module (DAGM) to generate learnable depth attention maps for more robust depth-sensitive RGB feature extraction. Moreover, to perform effective multi-modal feature fusion, we further present an automatic neural architecture search approach for RGB-D SOD, which effectively discovers a feasible architecture within our specially designed multi-modal multi-scale search space. Extensive experiments on nine standard benchmarks demonstrate the effectiveness of the proposed approach against the state-of-the-art. We name the enhanced learnable Depth-Sensitive Attention and Automatic multi-modal Fusion framework DSA²Fv2.
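The core idea described above, using a depth-derived attention map to gate RGB features, can be illustrated with a minimal sketch. Note this is a hypothetical NumPy illustration of the general depth-attention concept, not the authors' DSAM/DAGM implementation; the function name `depth_attention` and the sigmoid gating are assumptions for demonstration only.

```python
import numpy as np

def depth_attention(rgb_feat, depth_map):
    """Hypothetical sketch: modulate RGB features with a depth-derived attention map.

    rgb_feat:  (C, H, W) feature tensor
    depth_map: (H, W) raw depth values
    """
    # Normalize depth to [0, 1] as a crude spatial prior
    # (the paper's DAGM instead *learns* this attention map).
    d = (depth_map - depth_map.min()) / (depth_map.max() - depth_map.min() + 1e-8)
    # Sigmoid gating centered at mid-depth, emphasizing nearer/farther regions
    att = 1.0 / (1.0 + np.exp(-(d - 0.5)))
    # Broadcast the spatial attention map across all channels
    return rgb_feat * att[None, :, :]
```

In the paper's formulation this gating suppresses background distraction by exploiting the depth geometry prior; the learnable DAGM replaces the fixed normalization above with a trainable mapping from depth to attention.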
- Subjects
ATTENTION; FEATURE extraction; DISTRACTION
- Publication
International Journal of Computer Vision, 2022, Vol 130, Issue 11, p2822
- ISSN
0920-5691
- Publication type
Article
- DOI
10.1007/s11263-022-01646-0