Expansion of training texts to generate a topic-dependent language model for meeting speech recognition

Kazushige Egashira, Kazuya Kojima, Masaru Yamashita, Katsuya Yamauchi, Shoichi Matsunaga

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

Abstract

This paper proposes methods for expanding a baseline training corpus to generate a topic-dependent language model for more accurate recognition of meeting speech. Preparing a universal language model that can cope with the variety of topics discussed in meetings is very difficult. Our strategy is to generate topic-dependent training texts using two methods. The first is text collection from web pages using queries that consist of topic-dependent confident terms; these terms are selected from preparatory recognition results based on the TF-IDF (TF: term frequency; IDF: inverse document frequency) value of each term. The second is text generation using participants' names. The topic-dependent language model is generated from these new texts and the baseline corpus. Compared with a language model trained only on the baseline corpus, the language model generated by the proposed strategy reduced perplexity by 16.4% and the out-of-vocabulary rate by 37.5%. This improvement was also confirmed through meeting speech recognition.
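As an illustration of the term-selection step, the following is a minimal sketch of ranking the terms of a preparatory recognition result by TF-IDF against a set of background documents. The abstract does not specify the TF-IDF variant, the confidence criterion, or how queries are assembled, so the function name `top_tfidf_terms`, the normalization, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter


def top_tfidf_terms(documents, target_index, k=3):
    """Return the k terms of documents[target_index] with the highest TF-IDF.

    Hypothetical helper: uses length-normalized TF and IDF = log(N / df),
    which may differ from the weighting used in the paper.
    """
    n_docs = len(documents)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for doc in documents:
        df.update(set(doc))

    target = documents[target_index]
    tf = Counter(target)
    scores = {
        term: (count / len(target)) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


# Toy data (assumed): one tokenized recognition result and two background texts.
meeting = "the budget for the acoustic model training the budget review".split()
background = [
    "the weather is nice and the sky is clear".split(),
    "the train for the city leaves soon".split(),
]
docs = [meeting] + background

# Candidate query terms for web text collection.
query_terms = top_tfidf_terms(docs, 0, k=2)
print(query_terms)
```

Terms that occur in every document, such as "the", receive an IDF of zero and therefore never rank as query terms, which is the property that makes TF-IDF useful for picking topic-specific words.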

Original language: English
Title of host publication: 2012 Conference Handbook - Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2012
Publication status: Published - 2012
Externally published: Yes
Event: 2012 4th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2012 - Hollywood, CA, United States
Duration: Dec 3, 2012 to Dec 6, 2012

Other

Other: 2012 4th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2012
Country/Territory: United States
City: Hollywood, CA
Period: 12/3/12 to 12/6/12

All Science Journal Classification (ASJC) codes

  • Information Systems
