Building crawler for user-specific web search engines

Shingo Takasago, Ryuzo Hasegawa, Hiroshi Fujita, Miyuki Koshimura

Research output: Contribution to journal, Article

Abstract

Today, the volume of data available on the WWW has become huge, and finding information on the WWW is a difficult task for a novice user, even with standard search engines. One solution to this problem is to build a user-specific search engine whose database contains a large number of the web documents that a particular user needs. In this paper, we present a method of building a crawler that searches the subset of the WWW consisting of on-topic pages. We show an effective strategy for leading the crawler to on-topic pages by using a naive Bayes text classifier trained on evaluations of pages gathered by the crawler.
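
The strategy outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a two-class multinomial naive Bayes classifier with add-one smoothing, a best-first frontier ordered by the classifier's score for each link's anchor text, and a small in-memory dictionary standing in for the web. The names NaiveBayes, crawl, and pages are hypothetical, and the two hand-labelled training texts stand in for the page evaluations mentioned in the abstract.

```python
import heapq
import math
from collections import Counter


class NaiveBayes:
    """Two-class multinomial naive Bayes with add-one smoothing (illustrative)."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, text, on_topic):
        # Assumption: a page evaluation is reduced to a boolean on-topic label.
        self.word_counts[on_topic].update(text.lower().split())
        self.doc_counts[on_topic] += 1

    def log_odds(self, text):
        """Return log P(on-topic | text) - log P(off-topic | text)."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        n_docs = sum(self.doc_counts.values())
        score = {}
        for label in (True, False):
            total = sum(self.word_counts[label].values())
            # Smoothed class prior.
            s = math.log((self.doc_counts[label] + 1) / (n_docs + 2))
            for word in text.lower().split():
                # Smoothed word likelihood; the extra +1 covers unseen words.
                s += math.log((self.word_counts[label][word] + 1)
                              / (total + len(vocab) + 1))
            score[label] = s
        return score[True] - score[False]


def crawl(seed, pages, classifier, budget):
    """Best-first crawl that follows the highest-scoring link first.

    pages is a hypothetical in-memory web:
    url -> (page_text, [(anchor_text, target_url), ...]).
    """
    frontier = [(0.0, seed)]  # min-heap keyed on the negated score
    seen = {seed}
    visited = []
    while frontier and len(visited) < budget:
        _, url = heapq.heappop(frontier)
        _text, links = pages[url]
        visited.append(url)
        for anchor, target in links:
            if target in pages and target not in seen:
                seen.add(target)
                priority = -classifier.log_odds(anchor)
                heapq.heappush(frontier, (priority, target))
    return visited


# Toy data, invented for illustration only.
pages = {
    "seed": ("portal", [("jazz music albums", "a"), ("car insurance deals", "b")]),
    "a": ("jazz music reviews", [("jazz concert", "c")]),
    "b": ("insurance quotes", [("cheap car loans", "d")]),
    "c": ("jazz concert dates", []),
    "d": ("loans", []),
}

nb = NaiveBayes()
nb.train("jazz music concert albums", True)
nb.train("car insurance loans deals", False)

order = crawl("seed", pages, nb, budget=3)
print(order)  # ['seed', 'a', 'c']
```

With a budget of three pages, the sketch visits the seed and then the two on-topic pages, leaving the off-topic branch unexplored. A real crawler would also fetch pages over HTTP, respect robots.txt, and retrain the classifier as newly gathered pages are evaluated.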

Original language: English
Pages (from-to): 25-29
Number of pages: 5
Journal: Research Reports on Information Science and Electrical Engineering of Kyushu University
Volume: 9
Issue number: 1
Publication status: Published - Mar 2004

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering
  • Hardware and Architecture
  • Engineering (miscellaneous)
