Kaka, Jhansi Rani; Kangala, Vijay Kumar

doi:10.22266/ijies2024.0630.59

Title: A Hybrid DMFCC-LPC Based Feature Extraction with DCNN Clustering for Speaker Diarization.
Authors: Kaka, Jhansi Rani; Kangala, Vijay Kumar
Abstract: Speaker Diarization (SD), or speaker indexing, is a procedure for automatically partitioning a conversation into homogeneous segments according to speaker. A trustworthy diarization method accurately estimates the variable-length assertion, and it involves major steps such as speech detection, speaker merging, and speaker change. The major problem with the SD method is enhancing the readability of speech transcription. In this study, a hybrid feature extraction method based on the Dynamic Mel Frequency Cepstral Coefficient (DMFCC) and Linear Prediction Coding (LPC) is proposed for SD. The Voice Activity Detection (VAD) method is used to detect the presence or absence of a speaker in the audio lecture, followed by speaker segmentation using the extracted features. A Deep Convolutional Neural Network (DCNN) is used to determine the feature vector and cluster the speakers in the audio lecture. The results show that the proposed DMFCC-LPC delivers robust performance, with accuracy, Diarization Error Rate (DER), and False Positive Rate (FPR) of 0.967, 0.31, and 0.119, respectively, on the CALLHOME dataset, compared with the Speaker Diarization System using HXLP-DCNN with Sailfish Optimization Algorithm (SDS-HXLP-DCNN-SOA), Feature-Level fusion, and Self-supervised clustering with Path Integral Clustering (SSC-PIC).
Subjects: CONVOLUTIONAL neural networks; OPTIMIZATION algorithms; LINEAR codes; FEATURE extraction; TRANSCRIPTION (Linguistics); PATH integrals
Publication: International Journal of Intelligent Engineering & Systems, 2024, Vol 17, Issue 3, p757
ISSN: 2185-310X
Publication type: Article
DOI: 10.22266/ijies2024.0630.59
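
Illustrative note: the abstract names Linear Prediction Coding (LPC) as one half of the hybrid feature. The record does not give the authors' implementation, so the following is only a minimal sketch of one common way to compute LPC coefficients for a single frame, using the autocorrelation method with the Levinson-Durbin recursion. The function names `autocorrelation` and `lpc`, the prediction order, and the test signal are assumptions chosen for illustration, not details taken from the paper.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (plain Python lists)."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Levinson-Durbin recursion (hypothetical helper, not the paper's code).

    Returns (a, err) with a[0] == 1.0, where the linear predictor is
    x[n] ~ -sum(a[k] * x[n - k] for k in 1..order) and err is the
    residual prediction-error energy.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    if err == 0.0:
        # Silent frame: nothing to predict.
        return a, 0.0
    for i in range(1, order + 1):
        # Reflection coefficient for step i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        # Update coefficients a[1..i] from the previous-order solution.
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


# Assumed toy frame: a decaying exponential, which a first-order
# predictor models almost exactly.
frame = [0.9 ** n for n in range(200)]
coeffs, residual = lpc(frame, order=2)
```

In a hybrid scheme of the kind the abstract describes, per-frame coefficients such as `coeffs[1:]` would typically be concatenated with the cepstral features for the same frame; how the authors combine them is not stated in this record.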
