Quantitative analysis of mathematical documents

Seiichi Uchida, Akihiro Nomura, Masakazu Suzuki

Research output: Contribution to journal (article)

20 Citations (Scopus)

Abstract

Mathematical documents are analyzed from several viewpoints for the development of practical OCR for mathematical and other scientific documents. Specifically, four viewpoints are quantified using a large-scale database of mathematical documents, containing 690,000 manually ground-truthed characters: (i) the number of character categories, (ii) abnormal characters (e.g., touching characters), (iii) character size variation, and (iv) the complexity of the mathematical expressions. The result of these analyses clarifies the difficulties of recognizing mathematical documents and then suggests several promising directions to overcome them.

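Original language: English
Pages (from-to): 211-218
Number of pages: 8
Journal: International Journal on Document Analysis and Recognition
Volume: 7
Issue number: 4
DOI: 10.1007/s10032-005-0142-y
Publication status: Published - Sep 1, 2005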

Fingerprint

Optical character recognition
Chemical analysis

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
  • Computer Science Applications

Cite this

Quantitative analysis of mathematical documents. / Uchida, Seiichi; Nomura, Akihiro; Suzuki, Masakazu.

In: International Journal on Document Analysis and Recognition, Vol. 7, No. 4, 01.09.2005, p. 211-218.

Uchida, Seiichi; Nomura, Akihiro; Suzuki, Masakazu. / Quantitative analysis of mathematical documents. In: International Journal on Document Analysis and Recognition. 2005; Vol. 7, No. 4. pp. 211-218.
@article{b04d10a88d154708a0f947d45bf029a2,
title = "Quantitative analysis of mathematical documents",
abstract = "Mathematical documents are analyzed from several viewpoints for the development of practical OCR for mathematical and other scientific documents. Specifically, four viewpoints are quantified using a large-scale database of mathematical documents, containing 690,000 manually ground-truthed characters: (i) the number of character categories, (ii) abnormal characters (e.g., touching characters), (iii) character size variation, and (iv) the complexity of the mathematical expressions. The result of these analyses clarifies the difficulties of recognizing mathematical documents and then suggests several promising directions to overcome them.",
author = "Seiichi Uchida and Akihiro Nomura and Masakazu Suzuki",
year = "2005",
month = "9",
day = "1",
doi = "10.1007/s10032-005-0142-y",
language = "English",
volume = "7",
pages = "211--218",
journal = "International Journal on Document Analysis and Recognition",
issn = "1433-2833",
publisher = "Springer Verlag",
number = "4",
}

TY  - JOUR
T1  - Quantitative analysis of mathematical documents
AU  - Uchida, Seiichi
AU  - Nomura, Akihiro
AU  - Suzuki, Masakazu
PY  - 2005/9/1
Y1  - 2005/9/1
N2  - Mathematical documents are analyzed from several viewpoints for the development of practical OCR for mathematical and other scientific documents. Specifically, four viewpoints are quantified using a large-scale database of mathematical documents, containing 690,000 manually ground-truthed characters: (i) the number of character categories, (ii) abnormal characters (e.g., touching characters), (iii) character size variation, and (iv) the complexity of the mathematical expressions. The result of these analyses clarifies the difficulties of recognizing mathematical documents and then suggests several promising directions to overcome them.
AB  - Mathematical documents are analyzed from several viewpoints for the development of practical OCR for mathematical and other scientific documents. Specifically, four viewpoints are quantified using a large-scale database of mathematical documents, containing 690,000 manually ground-truthed characters: (i) the number of character categories, (ii) abnormal characters (e.g., touching characters), (iii) character size variation, and (iv) the complexity of the mathematical expressions. The result of these analyses clarifies the difficulties of recognizing mathematical documents and then suggests several promising directions to overcome them.
UR  - http://www.scopus.com/inward/record.url?scp=29444441304&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=29444441304&partnerID=8YFLogxK
U2  - 10.1007/s10032-005-0142-y
DO  - 10.1007/s10032-005-0142-y
M3  - Article
AN  - SCOPUS:29444441304
VL  - 7
SP  - 211
EP  - 218
JO  - International Journal on Document Analysis and Recognition
JF  - International Journal on Document Analysis and Recognition
SN  - 1433-2833
IS  - 4
ER  -