Title: Improved Speech Emotion Recognition Focusing on High-Level Data Representations and Swift Feature Extraction Calculation.
Authors: Abdusalomov, Akmalbek; Kutlimuratov, Alpamis; Nasimov, Rashid; Taeg Keun Whangbo
Abstract: The performance of a speech emotion recognition (SER) system is heavily influenced by the efficacy of its feature extraction techniques. The study was designed to advance the field of SER by optimizing feature extraction techniques, specifically through the incorporation of high-resolution Mel-spectrograms and the expedited calculation of Mel Frequency Cepstral Coefficients (MFCC). This initiative aimed to refine the system’s accuracy by identifying and mitigating the shortcomings commonly found in current approaches. Ultimately, the primary objective was to elevate both the intricacy and effectiveness of our SER model, with a focus on augmenting its proficiency in the accurate identification of emotions in spoken language. The research employed a dual-strategy approach for feature extraction. Firstly, a rapid computation technique for MFCC was implemented and integrated with a Bi-LSTM layer to optimize the encoding of MFCC features. Secondly, a pretrained ResNet model was utilized in conjunction with feature Stats pooling and dense layers for the effective encoding of Mel-spectrogram attributes. These two sets of features underwent separate processing before being combined in a Convolutional Neural Network (CNN) outfitted with a dense layer, with the aim of enhancing their representational richness. The model was rigorously evaluated using two prominent databases: CMU-MOSEI and RAVDESS. Notable findings include an accuracy rate of 93.2% on the CMU-MOSEI database and 95.3% on the RAVDESS database. Such exceptional performance underscores the efficacy of this innovative approach, which not only meets but also exceeds the accuracy benchmarks established by traditional models in the field of speech emotion recognition.
Subjects: EMOTION recognition; EXTRACTION techniques; DATABASES; ORAL communication
Publication: Computers, Materials & Continua, 2023, Vol 77, Issue 3, p2915
ISSN: 1546-2218
Publication type: Article
DOI: 10.32604/cmc.2023.044466
