Abstract
Techniques for the automatic extraction of related words are of great importance in many applications, such as query expansion and automatic thesaurus construction. In this paper, a method of extracting related words is proposed, based on statistical information about the co-occurrences of words in huge corpora. Mutual information is one such statistical measure and has been used mainly in natural language processing applications. A drawback, however, is that mutual information depends mainly on the frequencies of words. To overcome this difficulty, we propose a new measure, a normalized deviation of mutual information. We also reveal a correspondence between word ambiguity and related words using word relation graphs constructed with this measure.
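As background for the measure the abstract starts from, the following is a minimal sketch of standard pointwise mutual information computed from sentence-level co-occurrence counts. It is not the normalized deviation measure proposed in the paper, which the abstract does not define. The function name `pmi_table`, the toy corpus, and the choice of the sentence as the co-occurrence window are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(sentences):
    """Return {(w1, w2): PMI} for word pairs that co-occur in a sentence.

    Probabilities are estimated as the fraction of sentences containing
    the word (or both words); this windowing is an illustrative assumption.
    """
    n = len(sentences)
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        words = sorted(set(sentence))  # count each word once per sentence
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))

    table = {}
    for (a, b), c in pair_counts.items():
        p_ab = c / n
        p_a = word_counts[a] / n
        p_b = word_counts[b] / n
        # PMI(a, b) = log2( P(a, b) / (P(a) * P(b)) )
        table[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return table


# Hypothetical toy corpus, one token list per sentence.
corpus = [
    ["bank", "money"],
    ["bank", "river"],
    ["bank", "money"],
    ["river", "water"],
]
scores = pmi_table(corpus)
for pair, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(value, 3))
```

In this toy data the pair containing the rarest word, (`river`, `water`), receives the highest score even though it occurs only once, which is one way the frequency dependence mentioned in the abstract can show up.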
Original language | English |
---|---|
Pages (from-to) | 478-485 |
Number of pages | 8 |
Journal | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Volume | 2843 |
DOI | |
Publication status | Published - 2003 |
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- General Computer Science