TY - GEN
T1 - Structural analysis of mathematical formulae with verification based on formula description grammar
AU - Toyota, Seiichi
AU - Uchida, Seiichi
AU - Suzuki, Masakazu
PY - 2006/7/7
Y1 - 2006/7/7
N2 - In this paper, a reliable and efficient structural analysis method for mathematical formulae is proposed for practical mathematical OCR. The proposed method consists of three steps. In the first step, a fast structural analysis algorithm is applied to each mathematical formula to obtain a tree representation of the formula. This step generally provides a correct tree representation but sometimes provides an erroneous one. Therefore, the tree representation is verified by the following two steps. In the second step, the result of the analysis step (i.e., a tree representation) is converted into a one-dimensional representation. The third step is a verification step in which the one-dimensional representation is parsed with a formula description grammar, which is a context-free grammar specialized for mathematical formulae. If the one-dimensional representation is not accepted by the grammar, the result of the analysis step is flagged as erroneous and reported to OCR users. This three-step organization achieves reliable and efficient structural analysis without any two-dimensional grammars.
AB - In this paper, a reliable and efficient structural analysis method for mathematical formulae is proposed for practical mathematical OCR. The proposed method consists of three steps. In the first step, a fast structural analysis algorithm is applied to each mathematical formula to obtain a tree representation of the formula. This step generally provides a correct tree representation but sometimes provides an erroneous one. Therefore, the tree representation is verified by the following two steps. In the second step, the result of the analysis step (i.e., a tree representation) is converted into a one-dimensional representation. The third step is a verification step in which the one-dimensional representation is parsed with a formula description grammar, which is a context-free grammar specialized for mathematical formulae. If the one-dimensional representation is not accepted by the grammar, the result of the analysis step is flagged as erroneous and reported to OCR users. This three-step organization achieves reliable and efficient structural analysis without any two-dimensional grammars.
UR - http://www.scopus.com/inward/record.url?scp=33745550086&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33745550086&partnerID=8YFLogxK
U2 - 10.1007/11669487_14
DO - 10.1007/11669487_14
M3 - Conference contribution
AN - SCOPUS:33745550086
SN - 3540321403
SN - 9783540321408
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 153
EP - 163
BT - Document Analysis Systems VII - 7th International Workshop, DAS 2006, Proceedings
T2 - 7th International Workshop on Document Analysis Systems, DAS 2006
Y2 - 13 February 2006 through 15 February 2006
ER -