- Title
LPST-Det: Local-Perception-Enhanced Swin Transformer for SAR Ship Detection.
- Authors
Yang, Zhigang; Xia, Xiangyu; Liu, Yiming; Wen, Guiwei; Zhang, Wei Emma; Guo, Limin
- Abstract
Convolutional neural networks (CNNs) and transformers have driven rapid progress in object detection in synthetic aperture radar (SAR) images. However, the task remains challenging because SAR images typically exhibit unclear contours, sidelobe interference, speckle noise, multiple object scales, complex inshore backgrounds, etc. More effective feature extraction in the backbone and feature augmentation in the neck promise a performance gain. In response, we combine the strength of CNNs in extracting local features with the strength of transformers in capturing long-range dependencies to propose a Swin Transformer-based detector for arbitrary-oriented SAR ship detection. First, we incorporate a convolution-based local perception unit (CLPU) into the transformer structure to establish a powerful backbone. The resulting local-perception-enhanced Swin Transformer (LP-Swin) backbone combines the local information perception ability of CNNs with the global feature extraction ability of transformers to enhance representation learning, which extracts object features more effectively and boosts detection performance. Then, we devise a cross-scale bidirectional feature pyramid network (CS-BiFPN) that strengthens the propagation and integration of both location and semantic information. It makes more effective use of the features extracted by the backbone and mitigates the problem of multi-scale ships. Moreover, we design a one-stage framework that integrates LP-Swin, CS-BiFPN, and the detection head of R3Det for arbitrary-oriented object detection, which provides more precise locations for inclined objects and introduces less background information. On the SAR Ship Detection Dataset (SSDD), ablation studies verify the effectiveness of each component, and comparative experiments show that our detector attains 93.31% mean average precision (mAP), a detection performance comparable to that of other advanced detectors.
- Subjects
TRANSFORMER models; OBJECT recognition (Computer vision); CONVOLUTIONAL neural networks; SYNTHETIC aperture radar; SPECKLE interference; RADARSAT satellites
- Publication
Remote Sensing, 2024, Vol 16, Issue 3, p483
- ISSN
2072-4292
- Publication type
Article
- DOI
10.3390/rs16030483
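
The abstract's claim that arbitrary-oriented boxes introduce less background information than axis-aligned boxes can be illustrated with a little geometry. The following is a minimal sketch, not code from the paper: the function names (`obb_corners`, `enclosing_hbb`, `background_fraction`) and the (center, width, height, angle) box parameterization are assumptions made for this example. It computes what fraction of the horizontal box enclosing a rotated ship-like rectangle lies outside that rectangle.

```python
import math


def obb_corners(cx, cy, w, h, angle_deg):
    """Corners of a w-by-h box centered at (cx, cy), rotated by angle_deg.

    The (cx, cy, w, h, angle) parameterization is an assumption for this
    sketch; detectors differ in their angle conventions.
    """
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each half-extent offset, then translate to the box center.
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]


def enclosing_hbb(corners):
    """Smallest axis-aligned box (x_min, y_min, x_max, y_max) around the corners."""
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return min(xs), min(ys), max(xs), max(ys)


def background_fraction(w, h, angle_deg):
    """Fraction of the enclosing horizontal box not covered by the oriented box."""
    x0, y0, x1, y1 = enclosing_hbb(obb_corners(0.0, 0.0, w, h, angle_deg))
    hbb_area = (x1 - x0) * (y1 - y0)
    return 1.0 - (w * h) / hbb_area


if __name__ == "__main__":
    # A long, thin object (100 x 20) at several headings.
    for angle in (0, 15, 30, 45):
        print(angle, round(background_fraction(100, 20, angle), 3))
```

For an elongated 100 x 20 rectangle rotated by 45 degrees, roughly 72% of the enclosing horizontal box is background, while at 0 degrees the two boxes coincide. This is only the geometric intuition behind oriented detection heads; it says nothing about how the R3Det head itself regresses box parameters.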