Quantitative analysis of mathematical documents

Seiichi Uchida, Akihiro Nomura, Masakazu Suzuki

Research output: Contribution to journal › Article

20 Citations (Scopus)

Abstract

Mathematical documents are analyzed from several viewpoints for the development of practical OCR for mathematical and other scientific documents. Specifically, four viewpoints are quantified using a large-scale database of mathematical documents, containing 690,000 manually ground-truthed characters: (i) the number of character categories, (ii) abnormal characters (e.g., touching characters), (iii) character size variation, and (iv) the complexity of the mathematical expressions. The result of these analyses clarifies the difficulties of recognizing mathematical documents and then suggests several promising directions to overcome them.

Original language: English
Pages (from-to): 211-218
Number of pages: 8
Journal: International Journal on Document Analysis and Recognition
Volume: 7
Issue number: 4
Publication status: Published - Sep 1 2005

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
