- Title
Video-based spatio-temporal scene graph generation with efficient self-supervision tasks.
- Authors
Chen, Lianggangxu; Cai, Yiqing; Lu, Changhong; Wang, Changbo; He, Gaoqi
- Abstract
Spatio-temporal Scene Graph Generation (STSGG) aims to extract a sequence of graph-based semantic representations for high-level visual tasks. Existing works often fail to exploit strong temporal correlations and the details of local features, making it difficult to distinguish between dynamic relations (e.g., drinking) and static relations (e.g., holding). Furthermore, owing to a severe long-tailed bias, prediction results suffer from inaccurate classification of tail predicates. To address these issues, a SlowFast Local-aware Attention (SFLA) network is proposed for temporal modeling in STSGG. First, a two-branch network extracts static and dynamic relation features, respectively. Second, a local relation-aware attention (LRA) module assigns higher importance to the crucial elements of local relationships. Third, three novel self-supervision prediction tasks are proposed: spatial location, human attention state, and distance variation. These self-supervision tasks are trained jointly with the main model to alleviate the long-tailed bias and enhance feature discrimination. Systematic experiments show that the method achieves state-of-the-art performance on the recently proposed Action Genome (AG) dataset and the popular ImageNet Video dataset.
- Publication
Multimedia Tools & Applications, 2023, Vol 82, Issue 25, p38947
- ISSN
1380-7501
- Publication type
Article
- DOI
10.1007/s11042-023-14640-6
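The abstract's two key ideas, a two-branch (slow/fast) extractor separating static from dynamic relation features and an attention module that weights crucial local elements, can be caricatured in a minimal sketch. All function names, the sampling stride, and the pooling choices below are illustrative assumptions for intuition only, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def two_branch_features(frames, slow_stride=4):
    """Toy slow/fast split over a (T, D) sequence of frame features.

    Slow branch: sparsely sampled frames, averaged -> static relations
    (e.g., "holding"). Fast branch: frame-to-frame differences, averaged
    -> dynamic relations (e.g., "drinking"). Both are assumptions about
    what a slow/fast split could look like, not the SFLA architecture.
    """
    slow = frames[::slow_stride].mean(axis=0)
    fast = np.abs(np.diff(frames, axis=0)).mean(axis=0)
    return slow, fast

def local_relation_attention(query, elements):
    """Weight local relationship elements by dot-product similarity
    to a query vector, then pool them; a stand-in for the LRA idea of
    attaching higher importance to crucial local elements."""
    scores = elements @ query           # (N,)
    weights = softmax(scores)           # (N,), sums to 1
    return weights @ elements, weights  # pooled (D,), weights (N,)

rng = np.random.default_rng(0)
frames = rng.normal(size=(16, 8))       # 16 frames, 8-dim features
slow, fast = two_branch_features(frames)
fused, w = local_relation_attention(slow, np.stack([slow, fast]))
```

The sketch only shows the shape of the computation: two complementary temporal summaries, then an attention-weighted fusion whose weights sum to one.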