- Title
Layer-fusion for online mutual knowledge distillation.
- Authors
Hu, Gan; Ji, Yanli; Liang, Xingzhu; Han, Yuexing
- Abstract
Online knowledge distillation opens the door to distillation among parallel student networks, removing the heavy reliance on a pre-trained teacher model. Feature-fusion approaches further establish a positive training loop among the parallel students. However, existing fusion operations are placed only at the end of the sub-networks, which limits their capability. In this paper, we propose a novel online knowledge distillation approach that connects sub-networks through multiple layer-level feature fusion modules, triggering mutual learning among the student networks. During training, the fusion modules at middle layers act as auxiliary teachers, while the fusion module at the end of the sub-networks serves as the ensemble teacher. Each sub-network is optimized under the supervision of the two kinds of knowledge transmitted by these teachers. Furthermore, attention learning is adopted to enhance feature representation in the middle-layer fusion modules, helping to obtain more representative features. Extensive evaluations on the CIFAR10/CIFAR100 and ImageNet2012 datasets show the outstanding performance of the proposed approach.
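The ensemble-teacher idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes logit averaging as the fusion operation and KL divergence (with a temperature) as the distillation loss, both common choices in online mutual distillation; the paper's actual fusion modules also operate on middle-layer features with attention.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax, numerically stabilized.
    z = logits / temperature
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kl_div(p, q):
    # KL(p || q), averaged over the batch; p, q are probability rows.
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

def mutual_distill_losses(logits_a, logits_b, temperature=3.0):
    """Fuse two students' logits into an ensemble teacher (here: a
    simple average) and return each student's distillation loss,
    i.e. the KL divergence from the teacher's softened distribution."""
    teacher = softmax((logits_a + logits_b) / 2.0, temperature)
    loss_a = kl_div(teacher, softmax(logits_a, temperature))
    loss_b = kl_div(teacher, softmax(logits_b, temperature))
    return loss_a, loss_b
```

In a full training loop each loss would be added to the student's cross-entropy term, so both students are pulled toward the fused ensemble while still fitting the labels; when the two students agree, the distillation losses vanish.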
- Subjects
DISTILLATION; ONLINE education
- Publication
Multimedia Systems, 2023, Vol. 29, Issue 2, p. 787
- ISSN
0942-4962
- Publication type
Article
- DOI
10.1007/s00530-022-01021-6