A web page segmentation approach using visual semantics

Jun Zeng, Brendan Flanagan, Sachio Hirokawa, Eisuke Ito

    Research output: Contribution to journal (article, peer-reviewed)

    3 Citations (Scopus)

    Abstract

    Web page segmentation has a variety of benefits and potential web applications. Early web page segmentation techniques are mainly based on machine learning algorithms and rule-based heuristics, which cannot be used for large-scale page segmentation. In this paper, we propose a formulated page segmentation method using visual semantics. Instead of analyzing the visual cues of web pages, the method uses three measures to formulate the visual semantics: the layout tree is used to recognize visually similar blocks; the seam degree describes how neatly the blocks are arranged; and the content similarity describes the degree of content coherence between blocks. A comparison experiment was conducted using the VIPS algorithm as a baseline. The experimental results show that the proposed method can divide a web page into appropriate semantic segments.
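
The abstract does not give the formulas behind its three measures. As a rough illustration of the third one only, the sketch below scores the content coherence of two text blocks with cosine similarity over term counts. This is an assumed stand-in, not the measure defined in the paper, and the names `term_counts` and `content_similarity` are hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def content_similarity(block_a: str, block_b: str) -> float:
    """Cosine similarity of two blocks' term-count vectors.

    Assumption: cosine similarity is used here purely for illustration;
    the paper's own content similarity measure may be defined differently.
    """
    a, b = term_counts(block_a), term_counts(block_b)
    # Dot product over the terms the two blocks share.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty block shares no content with anything
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    nav = "Home Products About Contact"
    footer = "Contact About Privacy Terms"
    article = "Web page segmentation divides a page into coherent blocks"
    print(content_similarity(nav, footer))
    print(content_similarity(nav, article))
```

Under this stand-in, blocks that share many terms score close to 1 and blocks with no terms in common score 0, so a segmenter could merge adjacent blocks whose score exceeds a chosen threshold.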

    Original language: English
    Pages (from-to): 223-230
    Number of pages: 8
    Journal: IEICE Transactions on Information and Systems
    Volume: E97-D
    Issue number: 2
    Publication status: Published - 2014

    All Science Journal Classification (ASJC) codes

    • Software
    • Hardware and Architecture
    • Computer Vision and Pattern Recognition
    • Electrical and Electronic Engineering
    • Artificial Intelligence
