Simultaneous estimation of vocal tract and voice source parameters based on an ARX model

Wen Ding, Hideki Kasuya, Shuichi Adachi

Research output: Article (peer-reviewed)

30 citations (Scopus)


A novel adaptive pitch-synchronous analysis method is proposed to simultaneously estimate vocal tract (formant/antiformant) and voice source parameters from speech waveforms. We use the parametric Rosenberg-Klatt (RK) model to generate a glottal waveform and an autoregressive-exogenous (ARX) model to represent the voiced speech production process. The Kalman filter algorithm is used to estimate the formant/antiformant parameters from the coefficients of the ARX model, and the simulated annealing method is employed as a nonlinear optimization approach to estimate the voice source parameters. The two approaches work together in a system identification procedure to find the best set of parameters for both models. The new method was compared with several other approaches on synthetic speech in terms of the accuracy of the estimated parameter values and was shown to be superior. We also show that the proposed method can accurately estimate the parameters from natural speech sounds. A major application of the analysis method is a concatenative formant synthesizer, which allows flexible control of the voice quality of synthetic speech.
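The following is a minimal sketch of the ARX idea described in the abstract, not the authors' implementation. It assumes a commonly cited polynomial form of the RK pulse (glottal flow a·t² − b·t³ during the open phase, so the differentiated flow is 2a·t − 3b·t²) and the standard ARX convention s(n) + Σ aᵢ s(n−i) = Σ bⱼ u(n−j) + e(n). The function names (`rk_derivative_pulse`, `arx_simulate`, `kalman_arx`), the model orders, and all numeric values are hypothetical choices for illustration. The voice source is treated as known here; the simulated annealing search over the source parameters and the conversion of ARX coefficients into formant/antiformant values are omitted.

```python
def rk_derivative_pulse(period, oq=0.6, av=1.0):
    """One pitch period (in samples) of an RK-style differentiated glottal flow.

    Assumed form: flow(t) = a*t^2 - b*t^3 while the glottis is open
    (0 <= t < oq*period) and 0 otherwise, so the derivative is 2*a*t - 3*b*t^2.
    """
    n_open = int(round(oq * period))
    a = 27.0 * av / (4.0 * oq ** 2 * period)
    b = 27.0 * av / (4.0 * oq ** 3 * period ** 2)
    return [2 * a * n - 3 * b * n * n if n < n_open else 0.0
            for n in range(period)]


def arx_simulate(a, b, u):
    """Noise-free ARX output: s(n) = -sum a[i]*s(n-1-i) + sum b[j]*u(n-j)."""
    s = []
    for n in range(len(u)):
        acc = sum(b[j] * u[n - j] for j in range(len(b)) if n - j >= 0)
        acc -= sum(a[i] * s[n - 1 - i] for i in range(len(a)) if n - 1 - i >= 0)
        s.append(acc)
    return s


def kalman_arx(s, u, p, q, r=1.0, p0=1e6, q_var=0.0):
    """Track ARX coefficients with a Kalman filter on a random-walk state.

    State: theta = [a_1..a_p, b_0..b_q]. Observation: s(n) = phi(n)^T theta + e(n).
    With q_var = 0 this reduces to recursive least squares.
    """
    dim = p + q + 1
    theta = [0.0] * dim
    P = [[p0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for n in range(max(p, q), len(s)):
        # Regressor built from past outputs and current/past inputs.
        phi = [-s[n - i] for i in range(1, p + 1)] + [u[n - j] for j in range(q + 1)]
        P_phi = [sum(P[i][k] * phi[k] for k in range(dim)) for i in range(dim)]
        denom = r + sum(phi[i] * P_phi[i] for i in range(dim))
        gain = [x / denom for x in P_phi]
        err = s[n] - sum(phi[i] * theta[i] for i in range(dim))
        theta = [theta[i] + gain[i] * err for i in range(dim)]
        # Covariance update; P is symmetric, so phi^T P equals P_phi transposed.
        P = [[P[i][j] - gain[i] * P_phi[j] + (q_var if i == j else 0.0)
              for j in range(dim)] for i in range(dim)]
    return theta[:p], theta[p:]


if __name__ == "__main__":
    u = rk_derivative_pulse(80) * 5                  # five identical pitch periods
    s = arx_simulate([-1.2, 0.8], [1.0, 0.5], u)     # hypothetical "vocal tract"
    a_hat, b_hat = kalman_arx(s, u, p=2, q=1)
    print("estimated a:", a_hat)
    print("estimated b:", b_hat)
```

In a joint scheme like the one the abstract describes, an outer optimizer would propose source parameters (here `oq` and `av`), regenerate `u`, rerun the coefficient estimate, and keep the setting that minimizes the prediction error. Setting `q_var` above zero lets the coefficient estimates drift over time, which is one simple way to make the filter adaptive.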

Journal: IEICE Transactions on Information and Systems
Publication status: Published - June 1995

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
  • Artificial Intelligence

