A method of extracting related words using standardized mutual information

Tomohiko Sugimachi, Akira Ishino, Masayuki Takeda, Fumihiro Matsuo

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

Techniques for the automatic extraction of related words are of great importance in many applications, such as query expansion and automatic thesaurus construction. In this paper, a method of extracting related words is proposed, based on statistical information about the co-occurrences of words in huge corpora. Mutual information is one such statistical measure and has been used mainly in natural language processing applications. A drawback, however, is that mutual information depends mainly on the frequencies of words. To overcome this difficulty, we propose a normalized deviation of mutual information as a new measure. We also reveal a correspondence between word ambiguity and related words using word relation graphs constructed with this measure.
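The frequency dependence mentioned in the abstract can be seen in a small sketch. The abstract does not give the paper's formula, so the code below makes two assumptions: it uses the common pointwise form of mutual information computed from raw counts, and it uses a plain z-score as one possible reading of "standardized". The function names `pmi` and `standardize` and all counts are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, pstdev


def pmi(pair_count, x_count, y_count, total):
    """Pointwise mutual information log2(P(x,y) / (P(x) * P(y))) from raw counts."""
    return math.log2(pair_count * total / (x_count * y_count))


def standardize(scores):
    """Z-score a dict of scores.

    Illustrative assumption only: this is NOT the measure defined in the paper.
    """
    values = list(scores.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {key: 0.0 for key in scores}
    return {key: (value - mu) / sigma for key, value in scores.items()}


# Hypothetical toy counts in a corpus of 1,000,000 co-occurrence windows.
TOTAL = 1_000_000

# Both pairs always occur together, yet the rarer pair scores far higher.
rare = pmi(pair_count=5, x_count=5, y_count=5, total=TOTAL)
common = pmi(pair_count=5000, x_count=5000, y_count=5000, total=TOTAL)
```

Here `rare` exceeds `common` even though both pairs are perfectly associated, which is the kind of frequency effect a standardized measure is meant to reduce.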

Original language: English
Pages (from-to): 478-485
Number of pages: 8
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2843
Publication status: Published - 2003

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)
