- Title
Dual-Stream Object Tracking Algorithm Based on Vision Transformer.
- Authors
JIANG Yingjie; SONG Xiaoning
- Abstract
Transformer-based object tracking algorithms mainly use the Transformer to fuse deep convolutional features, overlooking its capabilities in feature extraction and decoding prediction. To address this limitation, a dual-stream object tracking algorithm based on the vision Transformer is proposed. The attention-based Swin Transformer is introduced for feature extraction, with global information modeled through shifted windows. The Transformer encoder fully fuses the target features with the search region features, and the decoder learns the location information in the target query. Target prediction is then performed separately on the two information streams from the encoder and the decoder. The two predictions are combined by weighted fusion at the decision level to obtain the final tracking result, and a multi-supervision strategy is used. The proposed algorithm achieves state-of-the-art results on four challenging large-scale tracking datasets, LaSOT, TrackingNet, UAV123, and NFS, reaching success-rate area under the curve (AUC) values of 67.4%, 80.9%, 68.6%, and 66.0%, respectively. Furthermore, because complex post-processing steps are avoided, tracking is end-to-end and runs at 42 FPS.
- Subjects
TRACKING algorithms; OBJECT tracking (Computer vision); FEATURE extraction; INFORMATION modeling; ARTIFICIAL neural networks; VISION; DEEP learning
- Publication
Journal of Computer Engineering & Applications, 2022, Vol 58, Issue 12, p183
- ISSN
1002-8331
- Publication type
Article
- DOI
10.3778/j.issn.1002-8331.2203-0035
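
The decision-level weighted fusion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the fusion formula, so everything here is an assumption for illustration: the function name `fuse_boxes`, the `(x, y, w, h)` box format, and the single scalar `weight` are hypothetical and are not taken from the paper.

```python
from typing import Tuple

# Hypothetical box format (x, y, w, h); the record does not specify one.
Box = Tuple[float, float, float, float]


def fuse_boxes(encoder_box: Box, decoder_box: Box, weight: float = 0.5) -> Box:
    """Blend two box predictions with a convex combination.

    `weight` is the share given to the encoder-stream prediction; the
    decoder-stream prediction receives `1 - weight`. This is an assumed
    fusion rule, not the formula used by the authors.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return tuple(
        weight * e + (1.0 - weight) * d
        for e, d in zip(encoder_box, decoder_box)
    )


# Example: equal weighting of the two stream predictions.
fused = fuse_boxes((0.0, 0.0, 10.0, 10.0), (2.0, 2.0, 12.0, 12.0), weight=0.5)
```

With `weight=1.0` the sketch returns the encoder-stream box unchanged, and with `weight=0.0` the decoder-stream box, which makes the role of the weight easy to check.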