Toward part-based document image decoding

Wang Song, Seiichi Uchida, Marcus Liwicki

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Document image decoding (DID) is an attempt to understand the contents of a whole document without any reference information about font, language, etc. Typically, DID approaches assume a correct segmentation of the document and some a priori knowledge about the language or the script. Unfortunately, these assumptions do not hold across diverse documents, such as documents with fonts of various sizes, camera-captured documents, free-layout documents, or historical documents. In this paper, we propose a part-based character identification method in which no segmentation into characters is necessary and no a priori information about the document is needed. The approach clusters similar key points and groups frequent neighboring key point clusters. A second iteration is then performed, i.e., the groups are clustered again and, optionally, pairs of frequent group clusters are detected. Our first experimental results on documents with multiple font sizes already look very promising: we found nearly perfect correspondences between characters and detected group clusters.
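The first iteration described in the abstract (cluster similar key points, then group frequent neighboring clusters) can be illustrated with a minimal sketch. This is not the authors' implementation: the greedy leader clustering, the Euclidean distance, and the names `cluster_descriptors`, `frequent_neighbor_pairs`, `threshold`, `radius`, and `min_count` are all assumptions made for illustration, and the second iteration over groups is omitted.

```python
from collections import Counter
from itertools import combinations
from math import dist


def cluster_descriptors(descriptors, threshold):
    """Greedy leader clustering (an assumed stand-in for the paper's method).

    Each descriptor joins the first cluster whose leader lies within
    `threshold`; otherwise it becomes the leader of a new cluster.
    """
    leaders = []
    labels = []
    for d in descriptors:
        for idx, leader in enumerate(leaders):
            if dist(d, leader) <= threshold:
                labels.append(idx)
                break
        else:
            leaders.append(d)
            labels.append(len(leaders) - 1)
    return labels


def frequent_neighbor_pairs(points, labels, radius, min_count):
    """Count cluster-label pairs whose key points lie within `radius`
    of each other, keeping pairs seen at least `min_count` times."""
    counts = Counter()
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= radius:
            counts[tuple(sorted((labels[i], labels[j])))] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Toy data (hypothetical): two occurrences of the same "character",
# each made of two key points, plus one unrelated key point.
points = [(0, 0), (1, 0), (10, 0), (11, 0), (20, 0)]
descriptors = [(0.0,), (1.0,), (0.1,), (1.1,), (5.0,)]

labels = cluster_descriptors(descriptors, threshold=0.3)
groups = frequent_neighbor_pairs(points, labels, radius=1.5, min_count=2)
print(labels)  # [0, 1, 0, 1, 2]
print(groups)  # {(0, 1): 2}
```

In this toy example the cluster pair (0, 1) recurs at two locations, so it is kept as a candidate group, while the isolated key point in cluster 2 is not grouped. A real system would extract key points and descriptors from the document image with a local feature detector.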

Original language: English
Title of host publication: Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012
Pages: 266-270
Number of pages: 5
DOI: https://doi.org/10.1109/DAS.2012.90
Publication status: Published - May 24 2012
Event: 10th IAPR International Workshop on Document Analysis Systems, DAS 2012 - Gold Coast, QLD, Australia
Duration: Mar 27 2012 to Mar 29 2012

Publication series

Name: Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012

Other

Other: 10th IAPR International Workshop on Document Analysis Systems, DAS 2012
Country: Australia
City: Gold Coast, QLD
Period: 3/27/12 to 3/29/12

Fingerprint

Decoding
Cameras

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering

Cite this

Song, W., Uchida, S., & Liwicki, M. (2012). Toward part-based document image decoding. In Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012 (pp. 266-270). [6195376] (Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012). https://doi.org/10.1109/DAS.2012.90

Toward part-based document image decoding. / Song, Wang; Uchida, Seiichi; Liwicki, Marcus.

Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012. 2012. p. 266-270 6195376 (Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012).

Song, W, Uchida, S & Liwicki, M 2012, Toward part-based document image decoding. in Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012., 6195376, Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012, pp. 266-270, 10th IAPR International Workshop on Document Analysis Systems, DAS 2012, Gold Coast, QLD, Australia, 3/27/12. https://doi.org/10.1109/DAS.2012.90
Song W, Uchida S, Liwicki M. Toward part-based document image decoding. In Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012. 2012. p. 266-270. 6195376. (Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012). https://doi.org/10.1109/DAS.2012.90
Song, Wang ; Uchida, Seiichi ; Liwicki, Marcus. / Toward part-based document image decoding. Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012. 2012. pp. 266-270 (Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012).
@inproceedings{fac3328fd66543ab9d77d86ca66ccc52,
title = "Toward part-based document image decoding",
abstract = "Document image decoding (DID) is an attempt to understand the contents of a whole document without any reference information about font, language, etc. Typically, DID approaches assume a correct segmentation of the document and some a priori knowledge about the language or the script. Unfortunately, these assumptions do not hold across diverse documents, such as documents with fonts of various sizes, camera-captured documents, free-layout documents, or historical documents. In this paper, we propose a part-based character identification method in which no segmentation into characters is necessary and no a priori information about the document is needed. The approach clusters similar key points and groups frequent neighboring key point clusters. A second iteration is then performed, i.e., the groups are clustered again and, optionally, pairs of frequent group clusters are detected. Our first experimental results on documents with multiple font sizes already look very promising: we found nearly perfect correspondences between characters and detected group clusters.",
author = "Wang Song and Seiichi Uchida and Marcus Liwicki",
year = "2012",
month = "5",
day = "24",
doi = "10.1109/DAS.2012.90",
language = "English",
isbn = "9780769546612",
series = "Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012",
pages = "266--270",
booktitle = "Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012",

}

TY - GEN

T1 - Toward part-based document image decoding

AU - Song, Wang

AU - Uchida, Seiichi

AU - Liwicki, Marcus

PY - 2012/5/24

Y1 - 2012/5/24

AB - Document image decoding (DID) is an attempt to understand the contents of a whole document without any reference information about font, language, etc. Typically, DID approaches assume a correct segmentation of the document and some a priori knowledge about the language or the script. Unfortunately, these assumptions do not hold across diverse documents, such as documents with fonts of various sizes, camera-captured documents, free-layout documents, or historical documents. In this paper, we propose a part-based character identification method in which no segmentation into characters is necessary and no a priori information about the document is needed. The approach clusters similar key points and groups frequent neighboring key point clusters. A second iteration is then performed, i.e., the groups are clustered again and, optionally, pairs of frequent group clusters are detected. Our first experimental results on documents with multiple font sizes already look very promising: we found nearly perfect correspondences between characters and detected group clusters.

UR - http://www.scopus.com/inward/record.url?scp=84862074296&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84862074296&partnerID=8YFLogxK

U2 - 10.1109/DAS.2012.90

DO - 10.1109/DAS.2012.90

M3 - Conference contribution

AN - SCOPUS:84862074296

SN - 9780769546612

T3 - Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012

SP - 266

EP - 270

BT - Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012

ER -