Structural analysis of mathematical formulae with verification based on formula Description Grammar

Seiichi Toyota, Seiichi Uchida, Masakazu Suzuki

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

In this paper, a reliable and efficient structural analysis method for mathematical formulae is proposed for practical mathematical OCR. The proposed method consists of three steps. In the first step, a fast structural analysis algorithm is applied to each mathematical formula to obtain a tree representation of the formula. This step generally provides a correct tree representation but sometimes provides an erroneous one. Therefore, the tree representation is verified by the following two steps. In the second step, the result of the analysis step (i.e., the tree representation) is converted into a one-dimensional representation. The third step is a verification step in which the one-dimensional representation is parsed by a formula description grammar, which is a context-free grammar specialized for mathematical formulae. If the one-dimensional representation is not accepted by the grammar, the result of the analysis step is flagged as erroneous and reported to OCR users. This three-step organization achieves reliable and efficient structural analysis without any two-dimensional grammars.
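
The second and third steps described above can be illustrated with a minimal Python sketch. The abstract does not give the tree format, the one-dimensional notation, or the formula description grammar itself, so everything here is an assumption made for illustration: the `Node` layout, the LaTeX-like token string produced by `linearize`, and the toy grammar recognized by `accepts` (operands joined by `+`, `-`, `=`, with optional subscript and superscript groups) are all hypothetical and cover only a tiny subset of real formulae.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node: a symbol, optional scripts, and a right neighbour."""
    symbol: str
    sup: Optional["Node"] = None   # superscript subtree
    sub: Optional["Node"] = None   # subscript subtree
    next: Optional["Node"] = None  # next symbol on the same baseline


def linearize(node: Optional[Node]) -> List[str]:
    """Step 2 (sketch): flatten the tree into a one-dimensional token list."""
    tokens: List[str] = []
    while node is not None:
        tokens.append(node.symbol)
        if node.sub is not None:
            tokens += ["_", "{"] + linearize(node.sub) + ["}"]
        if node.sup is not None:
            tokens += ["^", "{"] + linearize(node.sup) + ["}"]
        node = node.next
    return tokens


OPERATORS = {"+", "-", "="}


class Recognizer:
    """Step 3 (sketch): recursive-descent recognizer for a toy context-free grammar.

    expr := term (OPERATOR term)*
    term := atom ('_' '{' expr '}')? ('^' '{' expr '}')?
    atom := alphanumeric symbol | '(' expr ')'
    """

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def eat(self, tok: str) -> bool:
        if self.peek() == tok:
            self.pos += 1
            return True
        return False

    def expr(self) -> bool:
        if not self.term():
            return False
        while self.peek() in OPERATORS:
            self.pos += 1
            if not self.term():
                return False
        return True

    def term(self) -> bool:
        if not self.atom():
            return False
        for mark in ("_", "^"):
            if self.eat(mark):
                if not (self.eat("{") and self.expr() and self.eat("}")):
                    return False
        return True

    def atom(self) -> bool:
        tok = self.peek()
        if tok is not None and tok.isalnum():
            self.pos += 1
            return True
        if self.eat("("):
            return self.expr() and self.eat(")")
        return False


def accepts(tokens: List[str]) -> bool:
    """True if the whole token list is derivable from `expr`."""
    rec = Recognizer(tokens)
    return rec.expr() and rec.pos == len(tokens)


# Correct analysis of "x^2 + y".
good = Node("x", sup=Node("2"), next=Node("+", next=Node("y")))
# Simulated analysis error: the "+" was attached to x as a superscript.
bad = Node("x", sup=Node("+"), next=Node("y"))

for tree in (good, bad):
    tokens = linearize(tree)
    verdict = "accepted" if accepts(tokens) else "rejected: flag for the user"
    print(" ".join(tokens), "->", verdict)
```

In this sketch the verification never looks at symbol positions: a structural error shows up only because the flattened string falls outside the grammar, which is the idea the abstract attributes to the verification step. Errors that still yield a grammatical string would pass unnoticed in such a scheme.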

Original language: English
Title of host publication: Document Analysis Systems VII - 7th International Workshop, DAS 2006, Proceedings
Pages: 153-163
Number of pages: 11
DOIs: https://doi.org/10.1007/11669487_14
Publication status: Published - Jul 7 2006
Event: 7th International Workshop on Document Analysis Systems, DAS 2006 - Nelson, New Zealand
Duration: Feb 13 2006 - Feb 15 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3872 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 7th International Workshop on Document Analysis Systems, DAS 2006
Country: New Zealand
City: Nelson
Period: 2/13/06 - 2/15/06

Fingerprint

Structural analysis
Grammar
Optical character recognition
Context-free grammars

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Toyota, S., Uchida, S., & Suzuki, M. (2006). Structural analysis of mathematical formulae with verification based on formula Description Grammar. In Document Analysis Systems VII - 7th International Workshop, DAS 2006, Proceedings (pp. 153-163). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3872 LNCS). https://doi.org/10.1007/11669487_14

@inproceedings{1a5081a6229b4b8dbe14f9d61958d9ca,
title = "Structural analysis of mathematical formulae with verification based on formula Description Grammar",
abstract = "In this paper, a reliable and efficient structural analysis method for mathematical formulae is proposed for practical mathematical OCR. The proposed method consists of three steps. In the first step, a fast structural analysis algorithm is performed on each mathematical formula to obtain a tree representation of the formula. This step generally provides a correct tree representation but sometimes provides an erroneous representation. Therefore, the tree representation is verified by the following two steps. In the second step, the result of the analysis step, (i.e., a tree representation) is converted into a one-dimensional representation. The third step is a verification step where the one-dimensional representation is parsed by a formula description grammar, which is a context-free grammar specialized for mathematical formulae. If the one-dimensional representation is not accepted by the grammar, the result of the analysis step is detected as an erroneous result and alarmed to OCR users. This three-step organization achieves reliable and efficient structural analysis without any two-dimensional grammars.",
author = "Seiichi Toyota and Seiichi Uchida and Masakazu Suzuki",
year = "2006",
month = "7",
day = "7",
doi = "10.1007/11669487_14",
language = "English",
isbn = "3540321403",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "153--163",
booktitle = "Document Analysis Systems VII - 7th International Workshop, DAS 2006, Proceedings",

}

TY - GEN

T1 - Structural analysis of mathematical formulae with verification based on formula Description Grammar

AU - Toyota, Seiichi

AU - Uchida, Seiichi

AU - Suzuki, Masakazu

PY - 2006/7/7

Y1 - 2006/7/7

AB - In this paper, a reliable and efficient structural analysis method for mathematical formulae is proposed for practical mathematical OCR. The proposed method consists of three steps. In the first step, a fast structural analysis algorithm is performed on each mathematical formula to obtain a tree representation of the formula. This step generally provides a correct tree representation but sometimes provides an erroneous representation. Therefore, the tree representation is verified by the following two steps. In the second step, the result of the analysis step, (i.e., a tree representation) is converted into a one-dimensional representation. The third step is a verification step where the one-dimensional representation is parsed by a formula description grammar, which is a context-free grammar specialized for mathematical formulae. If the one-dimensional representation is not accepted by the grammar, the result of the analysis step is detected as an erroneous result and alarmed to OCR users. This three-step organization achieves reliable and efficient structural analysis without any two-dimensional grammars.

UR - http://www.scopus.com/inward/record.url?scp=33745550086&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33745550086&partnerID=8YFLogxK

U2 - 10.1007/11669487_14

DO - 10.1007/11669487_14

M3 - Conference contribution

AN - SCOPUS:33745550086

SN - 3540321403

SN - 9783540321408

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 153

EP - 163

BT - Document Analysis Systems VII - 7th International Workshop, DAS 2006, Proceedings

ER -