- Title
Tri-integrated convolutional neural network for audio image classification using Mel-frequency spectrograms.
- Authors
Khurana, Aayush; Mittal, Sweta; Kumar, Deepika; Gupta, Sonali; Gupta, Ayushi
- Abstract
Emotion is a state that encompasses a variety of physiological phenomena. Classification of emotions has many applications in fields such as customer review, product evaluation, and national security, making it a prominent area of research. State-of-the-art methodologies have used either text or audio files to classify emotions, whereas the proposed work uses Mel-frequency spectrograms. An integrated methodology, TiCNN (Tri-integrated Convolutional Neural Network), is proposed for classifying emotions into eight classes. Three models, namely VGG16, VGG19, and a proposed CNN architecture, are integrated and trained on the RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) dataset. The integrated TiCNN approach achieves an accuracy of 93.27% on this eight-class task, with precision, recall, and F1-score of 0.93, 0.92, and 0.92, respectively. For further validation, the proposed methodology is also evaluated on the EMO-DB (Berlin Database of Emotional Speech) dataset, where the TiCNN model achieves an accuracy of 92.78%. In an empirical comparison with conventional transfer learning models and state-of-the-art methodologies, the proposed methodology is reported to outperform them.
- Subjects
BERLIN (Germany); CONVOLUTIONAL neural networks; INFORMATION technology security; SPECTROGRAMS; TEXT files; DATABASES; EVALUATION methodology
- Publication
Multimedia Tools & Applications, 2023, Vol 82, Issue 4, p5521
- ISSN
1380-7501
- Publication type
Article
- DOI
10.1007/s11042-022-13358-1