Page segmentation using a convolutional neural network with trainable co-occurrence features

Joonho Lee, Hideaki Hayashi, Wataru Ohyama, Seiichi Uchida

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

12 Citations (Scopus)

Abstract

In document analysis, page segmentation is a fundamental task that divides a document image into semantic regions. In addition to local features, such as pixel-wise information, co-occurrence features are also useful for extracting texture-like periodic information for accurate segmentation. However, existing convolutional neural network (CNN)-based methods do not have any mechanisms that explicitly extract co-occurrence features. In this paper, we propose a method for page segmentation using a CNN with trainable multiplication layers (TMLs). The TML is specialized for extracting co-occurrences from feature maps, thereby supporting the detection of objects with similar textures and periodicities. This property is also considered to be effective for document image analysis because of regularity in text line structures, tables, etc. In the experiment, we achieved promising performance on a pixel-wise page segmentation task by combining TMLs with U-Net. The results demonstrate that TMLs can improve performance compared to the original U-Net. The results also demonstrate that TMLs are helpful for detecting regions with periodically repeating features, such as tables and main text.
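
To make the idea of extracting co-occurrences by multiplication more concrete, here is a minimal NumPy sketch. The abstract does not specify how the TML is computed, so everything in this sketch is an illustrative assumption: the log-domain "product of powers" formulation, the function name `multiplicative_cooccurrence`, and the parameters `weights` and `eps` are all hypothetical. It shows only that a weighted sum over channels in the log domain yields a product of channel responses, which is large only where the selected features fire together.

```python
import numpy as np


def multiplicative_cooccurrence(feature_map, weights, eps=1e-6):
    """Hypothetical multiplication-style layer (illustrative assumption only).

    feature_map: non-negative array of shape (C, H, W)
    weights:     array of shape (K, C); in a real layer these would be trained
    returns:     array of shape (K, H, W), where output k at each pixel is
                 the product over channels c of (x_c + eps) ** weights[k, c]
    """
    log_x = np.log(feature_map + eps)
    # A weighted sum over channels in the log domain is a product of powers.
    log_y = np.tensordot(weights, log_x, axes=([1], [0]))
    return np.exp(log_y)


# Toy input: 3 strictly positive channels on a 4x4 grid.
rng = np.random.default_rng(0)
x = rng.random((3, 4, 4)) + 0.1

# Weights that select channels 0 and 1, so the output is their pixel-wise product.
w = np.array([[1.0, 1.0, 0.0]])
y = multiplicative_cooccurrence(x, w, eps=0.0)
```

With these hand-set weights, `y[0]` equals the pixel-wise product of channels 0 and 1, i.e. a map of where the two features co-occur. In a trainable setting the weights would be learned rather than fixed, and such a layer would sit between convolutional stages of a network such as U-Net.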

Original language: English
Title of host publication: Proceedings - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019
Publisher: IEEE Computer Society
Pages: 1023-1028
Number of pages: 6
ISBN (Electronic): 9781728128610
Publication status: Published - Sept 2019
Event: 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019 - Sydney, Australia
Duration: Sept 20 2019 to Sept 25 2019

Publication series

Name: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
ISSN (Print): 1520-5363

Conference

Conference: 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019
Country/Territory: Australia
City: Sydney
Period: 9/20/19 to 9/25/19

All Science Journal Classification (ASJC) codes

  • Computer Vision and Pattern Recognition
