In this paper, we present a visualization method for detecting peculiar WWW pages given a set of keywords. Detecting peculiar WWW pages is expected to be profitable in areas such as business and investment. We try to capture the peculiarity of a WWW page from several viewpoints at summary levels, using the GF (Google Frequency) method to detect rare words and the PLSI (Probabilistic Latent Semantic Indexing) method to find the major topic and the remaining topic. Experimental results show that our visualization method, DPITT (Detecting Peculiar pages from Image, Topic and Term), outperforms Google in a problem setting that considerably favors the latter.
|Title of host publication||2006 IEEE International Conference on Granular Computing|
|Publication status||Published - 2006|
|Event||2006 IEEE International Conference on Granular Computing - Atlanta, GA, United States|
Duration: May 10, 2006 → May 12, 2006