Abstract
In this paper, we present a visualization method for detecting peculiar WWW pages given a set of keywords. Detecting peculiar WWW pages is expected to be profitable in various ways, including in business and investment. We try to capture the peculiarity of a WWW page from several viewpoints at summary levels by using the GF (Google Frequency) method to detect rare words and the PLSI (Probabilistic Latent Semantic Indexing) method to find the major topic and the remaining topic. Experimental results show that our visualization method DPITT (Detecting Peculiar pages from Image, Topic and Term) outperforms Google in a problem setting that considerably favors the latter.
Original language | English |
---|---|
Title of host publication | 2006 IEEE International Conference on Granular Computing |
Pages | 538-541 |
Number of pages | 4 |
Publication status | Published - 2006 |
Externally published | Yes |
Event | 2006 IEEE International Conference on Granular Computing - Atlanta, GA, United States; Duration: May 10 2006 → May 12 2006 |
Other
Other | 2006 IEEE International Conference on Granular Computing |
---|---|
Country/Territory | United States |
City | Atlanta, GA |
Period | 5/10/06 → 5/12/06 |
All Science Journal Classification (ASJC) codes
- Engineering (all)