- Title
Acoustic event diarization in TV/movie audios using deep embedding and integer linear programming.
- Authors
Li, Yanxiong; Zhang, Yuhan; Li, Xianku; Liu, Mingle; Wang, Wucheng; Yang, Jichen
- Abstract
In this study, we propose a method for acoustic event diarization that combines a deep embedding feature with an integer linear programming clustering algorithm. The deep embedding, learned by a deep auto-encoder network, represents the properties of different classes of acoustic events, and integer linear programming is then used to merge audio segments belonging to the same class of acoustic events. Four kinds of TV/movie audio (21.5 h in total) are used as experimental data: Sport, Situation comedy, Award ceremony, and Action movie. We compare the deep embedding with state-of-the-art features. Further, the integer linear programming clustering algorithm is compared with clustering algorithms adopted in previous works. Finally, the proposed method is compared with both supervised and unsupervised methods on the four kinds of TV/movie audio. The results show that the proposed method is superior to other unsupervised methods based on agglomerative information bottleneck, Bayesian information criterion, and spectral clustering, and is slightly inferior to the supervised method based on a deep neural network in terms of acoustic event error.
- Subjects
INTEGER programming; ACTION & adventure films; EMBEDDINGS (Mathematics); MOTION pictures; DEEP learning; LINEAR programming; MIXED integer linear programming
- Publication
Multimedia Tools & Applications, 2019, Vol 78, Issue 23, p33999
- ISSN
1380-7501
- Publication type
Article
- DOI
10.1007/s11042-019-07991-6