Title

Speech Enhancement Based on a Logarithmic Processing Mechanism and Time-Frequency Mask Estimation.

Authors

王显云; 窦姗姗; 程楚皓

Abstract

To address inaccurate speech estimation by time-spectrum models, this study proposes a model transformation method that yields the logarithmic probability density functions of noise and speech. Using the logarithmic relationship among noisy speech, clean speech, and noise, together with MMSE (Minimum Mean Square Error) estimation theory, a time-frequency mask for estimating the log spectrum of speech is derived. A soft mask is also derived from the logarithmic probability distributions of speech and noise; it weights the logarithmic subbands of noisy speech to reduce noise and improve the accuracy of speech estimation. Simulation results show that, compared with unprocessed noisy speech, the proposed method improves noise suppression by more than 3 dB. The MMSE-based time-frequency mask and the soft mask improve auditory perception by 27.7% and 29.4% on average, and intelligibility by 12.7% and 14.3% on average, respectively.
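
The soft-mask idea in the abstract can be illustrated with a minimal sketch. It is not the authors' derivation: it assumes, purely for illustration, that the log-power of speech and of noise in each bin follows a Gaussian distribution with known mean and variance, and that both classes have equal prior probability. The function names (`gaussian_log_pdf`, `soft_mask`, `apply_mask`), the `floor` parameter, and all numeric values are hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log of a univariate Gaussian density evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def soft_mask(noisy_log_spec, speech_mean, speech_var, noise_mean, noise_var):
    """Per-bin posterior probability of speech dominance (equal priors assumed)."""
    mask = []
    for y in noisy_log_spec:
        # Log-likelihood ratio of the noise model against the speech model.
        log_ratio = (gaussian_log_pdf(y, noise_mean, noise_var)
                     - gaussian_log_pdf(y, speech_mean, speech_var))
        # Clamp so math.exp cannot overflow for extreme ratios.
        log_ratio = max(-700.0, min(700.0, log_ratio))
        mask.append(1.0 / (1.0 + math.exp(log_ratio)))
    return mask


def apply_mask(noisy_log_spec, mask, floor=1e-3):
    """Weight each log-power bin: a gain m on power adds log(m) in the log domain."""
    return [y + math.log(max(m, floor)) for y, m in zip(noisy_log_spec, mask)]


# Hypothetical log-power values for three bins of one frame.
noisy = [0.0, -5.0, -2.5]
mask = soft_mask(noisy, speech_mean=0.0, speech_var=4.0,
                 noise_mean=-5.0, noise_var=4.0)
enhanced = apply_mask(noisy, mask)
```

Under these assumptions, a bin whose log-power sits near the speech mean receives a mask value close to 1 and is left almost unchanged, while a bin near the noise mean receives a value close to 0 and is attenuated down to the floor.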

Subjects

MEAN square algorithms; DISTRIBUTION (Probability theory); PROBABILITY density function; TIME perception; ESTIMATION theory; INTELLIGIBILITY of speech; SPEECH perception

Publication

Electronic Science & Technology, 2025, Vol 38, Issue 1, p45

ISSN

1007-7820

Publication type

Academic Journal

DOI

10.16180/j.enki.issn1007-7820.2025.01.007
