This paper presents a new approach to phoneme recognition. A modified Time-Delay Neural Network (TDNN) based on similarity vectors of clustering-node information is developed for this purpose. The speech data are first analyzed with a time-varying ARMA-D model to better capture their time-varying characteristics. The similarity vectors of the clustering nodes are generated by a Self-Organizing Clustering process. To study the performance of the system, speaker-independent recognition of the voiced plosive (stop) consonants /b, d, g/ in varying phonetic contexts is taken as the initial recognition task. The system achieves a recognition rate of about 84.3% for the stop consonants on speaker-independent speech data. All experiments use Japanese speech data supplied by ATR, Japan. The time taken by the system for training and recognition can be considered reasonable.
|Journal||Proceedings - IEEE International Symposium on Circuits and Systems|
|Publication status||Published - 1995|
|Event||Proceedings of the 1995 IEEE International Symposium on Circuits and Systems-ISCAS 95. Part 3 (of 3) - Seattle, WA, USA|
Duration: 30 Apr 1995 → 3 May 1995