- Title
Discriminative features based on modified log magnitude spectrum for playback speech detection.
- Authors
Yang, Jichen; Xu, Longting; Ren, Bo; Ji, Yunyun
- Abstract
To improve the performance of hand-crafted features for playback speech detection, this work proposes two discriminative features: constant-Q variance-based octave coefficients and constant-Q mean-based octave coefficients. Both rely on our finding that the variance-based and mean-based modified log magnitude spectra enhance the discrimination between genuine speech and playback speech. Constant-Q variance-based octave coefficients are obtained by combining the variance-based modified log magnitude spectrum with octave segmentation and the discrete cosine transform; constant-Q mean-based octave coefficients are obtained in the same way from the mean-based modified log magnitude spectrum. Both features are evaluated on the ASVspoof 2017 corpus version 2.0 and the ASVspoof 2019 physical access database. Experimental results show that both modified log magnitude spectra produce discriminative features for playback speech detection. Further results on the two databases show that both proposed features outperform some common features, such as mel frequency cepstral coefficients and constant-Q cepstral coefficients.
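The pipeline named in the abstract (modified log magnitude spectrum, octave segmentation, discrete cosine transform) can be illustrated with a rough sketch. The abstract does not define the modification, so the code below assumes one plausible reading: per-bin mean subtraction across frames for the mean-based variant, and per-bin scaling by the standard deviation across frames for the variance-based variant. The function names, the `bins_per_octave` and `n_coeffs` parameters, and the input layout (a list of frames, each a list of constant-Q log magnitudes) are all hypothetical and are not taken from the paper.

```python
import math


def dct2(x):
    """Unnormalized DCT-II of a sequence of floats."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def modify_log_spectrum(log_spec, mode="mean", eps=1e-8):
    """Apply an ASSUMED modification to a log magnitude spectrum.

    log_spec: list of frames, each a list of log magnitudes per constant-Q bin.
    mode="mean": subtract each bin's mean over all frames (assumption).
    mode="variance": divide each bin by its standard deviation over all
    frames (assumption). The paper's actual definition may differ.
    """
    n_frames = len(log_spec)
    n_bins = len(log_spec[0])
    means = [sum(frame[b] for frame in log_spec) / n_frames for b in range(n_bins)]
    if mode == "mean":
        return [[frame[b] - means[b] for b in range(n_bins)] for frame in log_spec]
    if mode == "variance":
        variances = [
            sum((frame[b] - means[b]) ** 2 for frame in log_spec) / n_frames
            for b in range(n_bins)
        ]
        return [
            [frame[b] / math.sqrt(variances[b] + eps) for b in range(n_bins)]
            for frame in log_spec
        ]
    raise ValueError("mode must be 'mean' or 'variance'")


def octave_coefficients(frame, bins_per_octave, n_coeffs):
    """Split one frame into octaves, apply the DCT to each octave, and
    concatenate the first n_coeffs coefficients of every octave.
    A trailing incomplete octave is dropped."""
    feats = []
    for start in range(0, len(frame) - bins_per_octave + 1, bins_per_octave):
        octave = frame[start:start + bins_per_octave]
        feats.extend(dct2(octave)[:n_coeffs])
    return feats


def extract_features(log_spec, bins_per_octave, n_coeffs, mode="mean"):
    """Chain the three steps and return one feature vector per frame."""
    modified = modify_log_spectrum(log_spec, mode)
    return [octave_coefficients(f, bins_per_octave, n_coeffs) for f in modified]
```

With a toy input of 3 frames and 8 bins, `extract_features(spec, bins_per_octave=4, n_coeffs=2)` returns 3 vectors of 4 values each (2 octaves times 2 coefficients). A real front end would compute the constant-Q transform first, which this sketch leaves out.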
- Subjects
DISCRETE cosine transforms; SPEECH
- Publication
EURASIP Journal on Audio Speech & Music Processing, 2020, Vol 2020, Issue 1, p1
- ISSN
1687-4714
- Publication type
Article
- DOI
10.1186/s13636-020-00173-5