SIMULTANEOUS ESTIMATION OF VOCAL TRACT AND VOICE SOURCE PARAMETERS WITH APPLICATION TO SPEECH SYNTHESIS

Wen Ding, Hideki Kasuya, Shuichi Adachi

Research output: Paper, peer-reviewed

2 citations (Scopus)

Abstract

To synthesize natural-sounding speech with voice quality variations, we propose a concatenative synthesis method based on stored formant/antiformant templates of vowel-consonant-vowel (VCV) segments and on sophisticated control of voice source parameters. Using the parametric Rosenberg-Klatt (RK) model to generate the voiced source waveform and an autoregressive exogenous (ARX) model to represent the voiced speech production process, we have devised a new adaptive pitch-synchronous analysis method that estimates the model parameters from which the templates are semiautomatically created. A Kalman filter algorithm handles the ARX model identification, and a simulated annealing method performs the nonlinear optimization that estimates the voice source parameters. The method has been tested on synthetic speech sounds by comparing it with some other approaches in terms of the accuracy of the estimated parameter values. Preliminary synthesis experiments have shown that natural-sounding speech with various voice qualities can be generated with the proposed method by manipulating the voice source parameters.
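
The source-plus-filter structure named in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a second-order ARX model, a polynomial RK-style flow u(t) = a*t^2 - b*t^3 scaled so that the peak flow equals a chosen amplitude, and a source waveform that is already known. Plain recursive least squares (the constant-parameter special case of a Kalman filter) stands in for the paper's adaptive pitch-synchronous estimator, and the simulated annealing search over source parameters is omitted. All function names, parameter names, and numeric values are hypothetical.

```python
import random

def rk_derivative_pulse(period, open_quotient, amplitude=1.0):
    """One pitch period of an RK-style glottal flow derivative.

    Open-phase flow is u(t) = a*t**2 - b*t**3. The scaling is an assumption:
    u returns to zero at t = n_open and its peak equals `amplitude`.
    """
    n_open = int(round(open_quotient * period))  # open-phase length in samples
    b = 27.0 * amplitude / (4.0 * n_open ** 3)
    a = b * n_open
    # Derivative of the flow during the open phase, zero in the closed phase.
    return [2 * a * t - 3 * b * t * t if t <= n_open else 0.0
            for t in range(period)]

def arx_synthesize(source, a1, a2, b0, noise_std, rng):
    """Second-order ARX: s[n] = -a1*s[n-1] - a2*s[n-2] + b0*u[n] + e[n]."""
    s = []
    for n, u in enumerate(source):
        s1 = s[n - 1] if n >= 1 else 0.0
        s2 = s[n - 2] if n >= 2 else 0.0
        s.append(-a1 * s1 - a2 * s2 + b0 * u + rng.gauss(0.0, noise_std))
    return s

def rls_identify(source, speech, p0=1e6):
    """Recursive least squares estimate of theta = [a1, a2, b0]."""
    theta = [0.0, 0.0, 0.0]
    P = [[p0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for n in range(2, len(speech)):
        phi = [-speech[n - 1], -speech[n - 2], source[n]]  # regressor
        Pphi = [sum(P[i][j] * phi[j] for j in range(3)) for i in range(3)]
        denom = 1.0 + sum(phi[i] * Pphi[i] for i in range(3))
        gain = [v / denom for v in Pphi]
        err = speech[n] - sum(phi[i] * theta[i] for i in range(3))
        theta = [theta[i] + gain[i] * err for i in range(3)]
        # P is symmetric, so P - gain * phi^T P equals P - gain * Pphi^T.
        P = [[P[i][j] - gain[i] * Pphi[j] for j in range(3)]
             for i in range(3)]
    return theta

# Hypothetical test signal: one resonance (pole radius 0.95), 50 pitch periods.
TRUE_THETA = [-1.7554, 0.9025, 1.0]
pulse = rk_derivative_pulse(period=80, open_quotient=0.6)
source = pulse * 50
speech = arx_synthesize(source, *TRUE_THETA, noise_std=1e-4,
                        rng=random.Random(0))
theta_hat = rls_identify(source, speech)
print("estimated [a1, a2, b0]:", theta_hat)
```

In this toy setting the source is given, so only the linear ARX coefficients are estimated. In the method described above the source parameters are themselves unknown, which is why a nonlinear search is paired with the linear identification step.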

Original language: English
Pages: 159-162
Number of pages: 4
Publication status: Published - 1994
Externally published: Yes
Event: 3rd International Conference on Spoken Language Processing, ICSLP 1994 - Yokohama, Japan
Duration: 18 Sep 1994 to 22 Sep 1994

Conference

Conference: 3rd International Conference on Spoken Language Processing, ICSLP 1994
Country/Territory: Japan
City: Yokohama
Period: 94/9/18 to 94/9/22

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
