Identifying subscripts and superscripts in mathematical documents

Walaa Aly, Seiichi Uchida, Masakazu Suzuki

Research output: Contribution to journal › Article

6 Citations (Scopus)

Abstract

In mathematical OCR, it is necessary to analyze two-dimensional structures of the component characters and symbols in mathematical expressions printed in scientific documents. In this paper, we analyze the positional relationships between adjacent characters for the purpose of automatic discrimination between baseline characters, subscripts, and superscripts, which is one of the most important and delicate parts of structure analysis. It has been proven through a large-scale experiment that this discrimination can be carried out almost perfectly (~99.89%) by using the relative size and position of adjacent characters.
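
As a rough illustration of the idea in the abstract, the sketch below labels a character as baseline, subscript, or superscript from its size and vertical position relative to the preceding character. The abstract does not give the paper's actual features or decision rule, so the `Box` type, the `classify_relation` function, and both threshold values are hypothetical choices made only for this example. It also ignores differences in natural glyph height (for instance "x" versus "b"), which a real system would need to normalize.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Vertical extent of a character's bounding box (y grows downward)."""
    top: float
    bottom: float

    @property
    def height(self) -> float:
        return self.bottom - self.top

    @property
    def center(self) -> float:
        return (self.top + self.bottom) / 2.0


def classify_relation(parent: Box, child: Box,
                      size_threshold: float = 0.85,
                      offset_threshold: float = 0.25) -> str:
    """Label `child` relative to the adjacent `parent` character.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    if parent.height <= 0 or child.height <= 0:
        raise ValueError("boxes must have positive height")

    # Relative size: scripts are usually set smaller than the parent.
    size_ratio = child.height / parent.height
    # Relative position: positive when the child sits higher than the parent.
    offset = (parent.center - child.center) / parent.height

    if size_ratio <= size_threshold:
        if offset >= offset_threshold:
            return "superscript"
        if offset <= -offset_threshold:
            return "subscript"
    return "baseline"


if __name__ == "__main__":
    base = Box(top=0, bottom=10)
    print(classify_relation(base, Box(top=-2, bottom=4)))
    print(classify_relation(base, Box(top=6, bottom=12)))
    print(classify_relation(base, Box(top=0, bottom=10)))
```

Fixed thresholds keep the sketch short; reaching the accuracy reported in the abstract would presumably require rules or distributions fitted to a large labelled data set.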

Original language: English
Pages (from-to): 195-209
Number of pages: 15
Journal: Mathematics in Computer Science
Volume: 2
Issue number: 2
DOI: 10.1007/s11786-008-0051-9
Publication status: Published - Dec 1 2008

Fingerprint

Superscript
Subscript
Optical character recognition
Discrimination
Adjacent
Experiments
Baseline
Necessary
Character
Experiment

All Science Journal Classification (ASJC) codes

  • Computational Mathematics
  • Computational Theory and Mathematics
  • Applied Mathematics

Cite this

Identifying subscripts and superscripts in mathematical documents. / Aly, Walaa; Uchida, Seiichi; Suzuki, Masakazu.

In: Mathematics in Computer Science, Vol. 2, No. 2, 01.12.2008, p. 195-209.

@article{88e63bb98348404c89750276f5d1264d,
title = "Identifying subscripts and superscripts in mathematical documents",
abstract = "In mathematical OCR, it is necessary to analyze two-dimensional structures of the component characters and symbols in mathematical expressions printed in scientific documents. In this paper, we analyze the positional relationships between adjacent characters for the purpose of automatic discrimination between baseline characters, subscripts, and superscripts, which is one of the most important and delicate parts of structure analysis. It has been proven through a large-scale experiment that this discrimination can be carried out almost perfectly (~99.89{\%}) by using the relative size and position of adjacent characters.",
author = "Walaa Aly and Seiichi Uchida and Masakazu Suzuki",
year = "2008",
month = "12",
day = "1",
doi = "10.1007/s11786-008-0051-9",
language = "English",
volume = "2",
pages = "195--209",
journal = "Mathematics in Computer Science",
issn = "1661-8270",
publisher = "Birkh{\"a}user Verlag Basel",
number = "2",

}

TY - JOUR

T1 - Identifying subscripts and superscripts in mathematical documents

AU - Aly, Walaa

AU - Uchida, Seiichi

AU - Suzuki, Masakazu

PY - 2008/12/1

Y1 - 2008/12/1

N2 - In mathematical OCR, it is necessary to analyze two-dimensional structures of the component characters and symbols in mathematical expressions printed in scientific documents. In this paper, we analyze the positional relationships between adjacent characters for the purpose of automatic discrimination between baseline characters, subscripts, and superscripts, which is one of the most important and delicate parts of structure analysis. It has been proven through a large-scale experiment that this discrimination can be carried out almost perfectly (~99.89%) by using the relative size and position of adjacent characters.

UR - http://www.scopus.com/inward/record.url?scp=69949190583&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=69949190583&partnerID=8YFLogxK

U2 - 10.1007/s11786-008-0051-9

DO - 10.1007/s11786-008-0051-9

M3 - Article

AN - SCOPUS:69949190583

VL - 2

SP - 195

EP - 209

JO - Mathematics in Computer Science

JF - Mathematics in Computer Science

SN - 1661-8270

IS - 2

ER -