Abstract
In this letter, we propose an improved speech/nonspeech classification method for classifying multimedia sources. To improve performance, we introduce a feature based on spectral duration analysis and combine it with recently proposed features such as the high zero crossing rate ratio (HZCRR), low short time energy ratio (LSTER), and pitch ratio (PR). In experiments on speech, music, and environmental sounds, the proposed method obtained better classification results than conventional approaches.
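The abstract names HZCRR and LSTER without defining them. As a rough illustration only, the sketch below follows the definitions commonly used in the audio classification literature: HZCRR is the fraction of frames in an analysis window whose zero crossing rate exceeds 1.5 times the window average, and LSTER is the fraction of frames whose short-time energy falls below 0.5 times the window average. The thresholds, the non-overlapping framing, and all function names are assumptions made for this example; they are not taken from the letter and may differ from the authors' implementation.

```python
from typing import List, Sequence


def frame_signal(x: Sequence[float], frame_len: int) -> List[Sequence[float]]:
    """Split x into non-overlapping frames, dropping a trailing partial frame."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, frame_len)]


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ (frame needs >= 2 samples)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def hzcrr(zcrs: Sequence[float], factor: float = 1.5) -> float:
    """Fraction of frames whose ZCR exceeds factor * mean ZCR (factor is assumed)."""
    avg = sum(zcrs) / len(zcrs)
    return sum(1 for z in zcrs if z > factor * avg) / len(zcrs)


def lster(energies: Sequence[float], factor: float = 0.5) -> float:
    """Fraction of frames whose energy is below factor * mean energy (factor is assumed)."""
    avg = sum(energies) / len(energies)
    return sum(1 for e in energies if e < factor * avg) / len(energies)


def window_features(x: Sequence[float], frame_len: int) -> dict:
    """Compute both ratios over one analysis window of samples."""
    frames = frame_signal(x, frame_len)
    return {
        "hzcrr": hzcrr([zero_crossing_rate(f) for f in frames]),
        "lster": lster([short_time_energy(f) for f in frames]),
    }
```

Speech alternates between voiced, unvoiced, and silent segments, so both ratios tend to be higher for speech than for music, which is why such features are typically combined in a single classifier input vector.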
Original language | English |
---|---|
Pages (from-to) | 830-832 |
Number of pages | 3 |
Journal | IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences |
Volume | E93-A |
Issue number | 4 |
Publication status | Published - 2010 Apr |
Keywords
- Audio indexing
- Feature combination
- Spectral duration analysis
- Speech/nonspeech classification
ASJC Scopus subject areas
- Signal Processing
- Computer Graphics and Computer-Aided Design
- Electrical and Electronic Engineering
- Applied Mathematics