An improved speech/nonspeech classification based on feature combination for audio indexing

Ji Soo Keum, Hyon Soo Lee, Masafumi Hagiwara

Research output: Contribution to journal > Article > peer-review

Abstract

In this letter, we propose an improved speech/nonspeech classification method for effectively classifying multimedia sources. To improve performance, we introduce a feature based on spectral duration analysis and combine it with recently proposed features such as the high zero crossing rate ratio (HZCRR), low short time energy ratio (LSTER), and pitch ratio (PR). In our experiments on speech, music, and environmental sounds, the proposed method obtained better classification results than conventional approaches.
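The abstract names HZCRR and LSTER without defining them. As a rough illustration only, the sketch below computes both ratios from frame-level zero crossing rate and short-time energy, using the definitions commonly given for these features: the fraction of frames whose zero crossing rate exceeds 1.5 times the window average, and the fraction whose energy falls below 0.5 times the window average. The thresholds, the framing scheme, and all function names are assumptions made for this example and are not taken from the letter. The spectral duration feature and the pitch ratio are not shown, since the abstract gives no detail on how they are computed.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, dropping any trailing partial frame."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(np.abs(np.diff(signs, axis=1)) / 2, axis=1)

def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def hzcrr(zcr, factor=1.5):
    """Share of frames whose ZCR exceeds factor * mean ZCR (factor is an assumed value)."""
    return float(np.mean(zcr > factor * zcr.mean()))

def lster(ste, factor=0.5):
    """Share of frames whose energy is below factor * mean energy (factor is an assumed value)."""
    return float(np.mean(ste < factor * ste.mean()))
```

In use, a clip would be framed once, `zero_crossing_rate` and `short_time_energy` would be evaluated per frame, and the two ratios would then be computed over each analysis window and passed to a classifier alongside the other features.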

Original language: English
Pages (from-to): 830-832
Number of pages: 3
Journal: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Volume: E93-A
Issue number: 4
Publication status: Published - 2010 Apr

Keywords

  • Audio indexing
  • Feature combination
  • Spectral duration analysis
  • Speech/nonspeech classification

ASJC Scopus subject areas

  • Signal Processing
  • Computer Graphics and Computer-Aided Design
  • Electrical and Electronic Engineering
  • Applied Mathematics
