Speaker normalization based on time-frequency warp with inter-frame consistency

Kei Yamada, Seiichi Uchida, Hiroaki Sakoe

Research output: Contribution to journal › Article › peer-review

Abstract

A new algorithm for speaker-independent spoken word recognition is presented. The algorithm is based on the time-frequency warping technique, in which frequency-axis warping is performed in addition to time-axis warping in order to compensate for spectral differences between individual speakers. In the conventional algorithm, the frequency-axis warping is determined independently at each frame (i.e., at each time). Such frame-wise warping tends to yield excessive deformations of the time-frequency plane. To suppress these deformations, inter-frame consistency of the frequency-axis warping is newly introduced as a constraint on the warping. The optimal warping is obtained by dynamic programming under this constraint. As an implementation technique, beam-search-based acceleration is also investigated. Experimental results indicate advantages of the present algorithm over the conventional algorithm.
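
The idea of constraining the frequency warp across frames can be illustrated with a small dynamic-programming sketch. This is not the authors' formulation, which the abstract does not give. It assumes, purely for illustration, that the frequency warp is a single integer channel shift per grid point, that the local distance is a city-block distance, that the time-axis steps are the three basic symmetric DTW moves, and that inter-frame consistency means the shift may change by at most `max_jump` between consecutive grid points on the path. The names `shifted_distance`, `tf_warp_distance`, `max_shift`, and `max_jump` are hypothetical, and beam-search pruning is omitted.

```python
import math


def shifted_distance(x_frame, r_frame, shift):
    """City-block distance between a reference frame and an input frame
    whose frequency axis is shifted by `shift` channels (assumed warp model)."""
    n = len(r_frame)
    total = 0.0
    for k in range(n):
        src = min(max(k + shift, 0), n - 1)  # clamp at the band edges
        total += abs(x_frame[src] - r_frame[k])
    return total


def tf_warp_distance(x, r, max_shift=2, max_jump=1):
    """Minimum accumulated distance over time warps and per-point frequency
    shifts, where the shift may change by at most `max_jump` per DP step."""
    shifts = list(range(-max_shift, max_shift + 1))
    n_s = len(shifts)
    n_i, n_j = len(x), len(r)
    inf = math.inf
    # cost[i][j][s]: best accumulated distance ending at input frame i,
    # reference frame j, using frequency shift shifts[s].
    cost = [[[inf] * n_s for _ in range(n_j)] for _ in range(n_i)]
    for i in range(n_i):
        for j in range(n_j):
            for si, s in enumerate(shifts):
                d = shifted_distance(x[i], r[j], s)
                if i == 0 and j == 0:
                    cost[i][j][si] = d
                    continue
                best = inf
                # Basic symmetric time-axis moves (an assumption of this sketch).
                for pi, pj in ((i - 1, j), (i, j - 1), (i - 1, j - 1)):
                    if pi < 0 or pj < 0:
                        continue
                    # Inter-frame consistency: only neighbouring shifts allowed.
                    lo = max(0, si - max_jump)
                    hi = min(n_s - 1, si + max_jump)
                    for ps in range(lo, hi + 1):
                        best = min(best, cost[pi][pj][ps])
                cost[i][j][si] = best + d
    return min(cost[n_i - 1][n_j - 1])
```

Setting `max_jump` to at least `2 * max_shift` removes the constraint and corresponds roughly to choosing the frequency warp independently at each frame, which is the behaviour the abstract attributes to the conventional algorithm. The state space grows with the number of candidate shifts, which is presumably where a beam search would help.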

Original language: English
Pages (from-to): 197-202
Number of pages: 6
Journal: Research Reports on Information Science and Electrical Engineering of Kyushu University
Volume: 3
Issue number: 2
Publication status: Published - 1998
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering
  • Hardware and Architecture
  • Engineering (miscellaneous)
