Proposal of seam degree and content similarity for web page segmentation

Jun Zeng, Brendan Flanagan, Qingyu Xiong, Junhao Wen, Sachio Hirokawa

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Page segmentation has received great attention in recent years. However, most research has relied on pre-defined heuristics or visual cues, which may not be suitable for large-scale page segmentation. In this paper, we propose two parameters, seam degree and content similarity, to indicate the degree of coherence of a page block. Instead of analyzing pre-defined heuristics or visual cues, our method uses visual and content features to determine whether a page block should be divided into smaller blocks. We also propose a principled page segmentation method using these two parameters. An experiment was conducted to determine the relationship between the two parameters and the number of resulting segments. The empirical results also show that our segmentation method can effectively segment a page into different semantic parts.

    Original language: English
    Title of host publication: Proceedings - 2nd IIAI International Conference on Advanced Applied Informatics, IIAI-AAI 2013
    Pages: 9-14
    Number of pages: 6
    DOI
    Publication status: Published - 2013
    Event: 2nd IIAI International Conference on Advanced Applied Informatics, IIAI-AAI 2013 - Matsue, Japan
    Duration: Aug 31, 2013 to Sep 4, 2013

    Publication series

    Name: Proceedings - 2nd IIAI International Conference on Advanced Applied Informatics, IIAI-AAI 2013

    Other

    Other: 2nd IIAI International Conference on Advanced Applied Informatics, IIAI-AAI 2013
    Country/Territory: Japan
    City: Matsue
    Period: 8/31/13 to 9/4/13

    All Science Journal Classification (ASJC) codes

    • Information Systems
