Mundodu Krishna, Prasanna Kumar; Ramaswamy, Kumaraswamy

doi:10.1049/iet-spr.2016.0450

Title: Single Channel speech separation based on empirical mode decomposition and Hilbert Transform.
Authors: Mundodu Krishna, Prasanna Kumar; Ramaswamy, Kumaraswamy
Abstract: In this study, the authors discuss unsupervised separation of two speakers from a single-microphone recording using empirical mode decomposition (EMD) and the Hilbert transform (HT), a combination generally known as the Hilbert–Huang transform. A two‐stage separation procedure is proposed for single‐channel (SC) speech separation. The initial stage of separation uses EMD, HT and instantaneous frequencies. EMD decomposes the mixed signal into oscillatory functions known as intrinsic mode functions (IMFs). Suitable IMFs are selected using successive EMD decompositions, and HT is applied to extract their instantaneous frequencies. The speech frames are grouped by speaker using the correlation of instantaneous frequencies between the mixed signal and the selected IMFs. The second stage further decomposes the estimated speaker signals into IMFs and finds their instantaneous amplitudes using HT. For each speaker, a ratio of the instantaneous amplitudes of the mixed speech and the stage 1 recovered speech signal is computed. A histogram of this ratio can be used to estimate the ideal binary mask for each speaker. These masks are applied to the speech mixture to estimate the underlying speakers. The proposed method was compared with existing unsupervised SC source separation algorithms. The results show significant improvement in objective measures.
Publication: IET Signal Processing (Wiley-Blackwell), 2017, Vol 11, Issue 5, p579
ISSN: 1751-9675
Publication type: Article
DOI: 10.1049/iet-spr.2016.0450
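
The abstract relies on the Hilbert transform to obtain instantaneous amplitudes and instantaneous frequencies. A minimal sketch of that one step follows, assuming NumPy is available. It is not the authors' implementation: the EMD stage, frame grouping and mask estimation are omitted, and the function names, the 8 kHz sampling rate and the 440 Hz test tone are illustrative assumptions only.

```python
import numpy as np


def analytic_signal(x):
    """Return the analytic signal of a real 1-D sequence via the FFT.

    Negative-frequency bins are zeroed and positive-frequency bins doubled,
    which is the standard discrete construction of x + j * HT(x).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the DC bin unchanged
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist bin unchanged
        weights[1:n // 2] = 2.0           # double positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def instantaneous_amplitude_and_frequency(x, fs):
    """Instantaneous amplitude, and instantaneous frequency in Hz.

    The frequency is the scaled first difference of the unwrapped phase,
    so it has one sample fewer than the input.
    """
    z = analytic_signal(x)
    amplitude = np.abs(z)
    phase = np.unwrap(np.angle(z))
    frequency = np.diff(phase) * fs / (2.0 * np.pi)
    return amplitude, frequency


# Illustrative input (an assumption, not data from the paper):
# one second of a 440 Hz tone with amplitude 0.5, sampled at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
tone = 0.5 * np.cos(2.0 * np.pi * 440.0 * t)
amp, freq = instantaneous_amplitude_and_frequency(tone, fs)
```

In a pipeline like the one described, such a function would be applied to each selected IMF rather than to the raw mixture, since instantaneous frequency is only meaningful for a narrow-band, single-component signal.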
