An improved speech/nonspeech classification based on feature combination for audio indexing

Ji Soo Keum, Hyon Soo Lee, Masafumi Hagiwara

Research output: Contribution to journal (Article)

Abstract

In this letter, we propose an improved speech/nonspeech classification method for classifying multimedia sources. To improve performance, we introduce a feature based on spectral duration analysis and combine it with recently proposed features: high zero-crossing rate ratio (HZCRR), low short-time energy ratio (LSTER), and pitch ratio (PR). In our experiments on speech, music, and environmental sounds, the proposed method obtained better classification results than conventional approaches.
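The abstract names HZCRR and LSTER without defining them. As a rough illustration only, the sketch below follows the formulation commonly used in the audio-classification literature: HZCRR is the fraction of frames in a window whose zero-crossing rate exceeds 1.5 times the window average, and LSTER is the fraction of frames whose short-time energy falls below 0.5 times the window average. The thresholds, the non-overlapping framing, and all function names are assumptions for this example and are not taken from the letter itself; the pitch ratio and the proposed spectral duration feature are not sketched because the abstract does not describe them.

```python
def frame_signal(samples, frame_len):
    """Split samples into non-overlapping frames, dropping a trailing partial frame."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def hzcrr(frames, factor=1.5):
    """Fraction of frames whose ZCR is above factor * mean ZCR (assumed threshold)."""
    zcrs = [zero_crossing_rate(f) for f in frames]
    avg = sum(zcrs) / len(zcrs)
    return sum(1 for z in zcrs if z > factor * avg) / len(zcrs)


def lster(frames, factor=0.5):
    """Fraction of frames whose energy is below factor * mean energy (assumed threshold)."""
    energies = [short_time_energy(f) for f in frames]
    avg = sum(energies) / len(energies)
    return sum(1 for e in energies if e < factor * avg) / len(energies)


if __name__ == "__main__":
    # Toy window: one rapidly alternating frame followed by three constant frames.
    window = [1, -1, 1, -1] + [1, 1, 1, 1] * 3
    frames = frame_signal(window, 4)
    print("HZCRR:", hzcrr(frames))
    print("LSTER:", lster(frames))
```

In this formulation speech tends to score higher on both ratios than music, because speech alternates between voiced, unvoiced, and silent frames; a classifier would take these ratios, computed per window, as input features.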

Original language: English
Pages (from-to): 830-832
Number of pages: 3
Journal: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Volume: E93-A
Issue number: 4
DOI: 10.1587/transfun.E93.A.830
Publication status: Published - 2010 Apr

Keywords

  • Audio indexing
  • Feature combination
  • Spectral duration analysis
  • Speech/nonspeech classification

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Graphics and Computer-Aided Design
  • Applied Mathematics
  • Signal Processing

Cite this

@article{956b68801374459bad511027a4228a16,
title = "An improved speech/nonspeech classification based on feature combination for audio indexing",
abstract = "In this letter, we propose an improved speech/nonspeech classification method to effectively classify a multimedia source. To improve performance, we introduce a feature based on spectral duration analysis, and combine recently proposed features such as high zero crossing rate ratio (HZCRR), low short time energy ratio (LSTER), and pitch ratio (PR). According to the results of our experiments on speech, music, and environmental sounds, the proposed method obtained high classification results when compared with conventional approaches.",
keywords = "Audio indexing, Feature combination, Spectral duration analysis, Speech/nonspeech classification",
author = "Keum, {Ji Soo} and Lee, {Hyon Soo} and Hagiwara, Masafumi",
year = "2010",
month = "4",
doi = "10.1587/transfun.E93.A.830",
language = "English",
volume = "E93-A",
pages = "830--832",
journal = "IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences",
issn = "0916-8508",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "4",

}

TY - JOUR

T1 - An improved speech/nonspeech classification based on feature combination for audio indexing

AU - Keum, Ji Soo

AU - Lee, Hyon Soo

AU - Hagiwara, Masafumi

PY - 2010/4

Y1 - 2010/4

AB - In this letter, we propose an improved speech/nonspeech classification method to effectively classify a multimedia source. To improve performance, we introduce a feature based on spectral duration analysis, and combine recently proposed features such as high zero crossing rate ratio (HZCRR), low short time energy ratio (LSTER), and pitch ratio (PR). According to the results of our experiments on speech, music, and environmental sounds, the proposed method obtained high classification results when compared with conventional approaches.

KW - Audio indexing

KW - Feature combination

KW - Spectral duration analysis

KW - Speech/nonspeech classification

UR - http://www.scopus.com/inward/record.url?scp=77950796943&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77950796943&partnerID=8YFLogxK

U2 - 10.1587/transfun.E93.A.830

DO - 10.1587/transfun.E93.A.830

M3 - Article

AN - SCOPUS:77950796943

VL - E93-A

SP - 830

EP - 832

JO - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

JF - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

SN - 0916-8508

IS - 4

ER -