Mathematical Document Categorization with Structure of Mathematical Expressions

Tokinori Suzuki, Atsushi Fujii

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

A mathematical document is a document used for mathematical communication, for example, a math paper or a discussion in an online Q&A community. Mathematical document categorization (MDC) is the task of classifying mathematical documents into mathematical categories, e.g. probability theory and set theory. This task is important for supporting user search in today's widespread digital libraries and archiving services. Although mathematical expressions (ME) in a document can carry essential information, being central to communication especially in mathematical fields, how to utilize ME for MDC has not yet matured. In this paper, we propose a classification method based on text combined with the structures of ME, which are expected to reflect conventions and rules specific to a category. We also present document collections built for evaluating MDC systems, together with an investigation of the category settings and their statistics. Our classification results demonstrate that the proposed method outperforms existing methods that use state-of-the-art ME modeling in terms of F-measure.

Original language: English
Title of host publication: 2017 ACM/IEEE Joint Conference on Digital Libraries, JCDL 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781538638613
DOIs
Publication status: Published - Jul 25 2017
Externally published: Yes
Event: 17th ACM/IEEE Joint Conference on Digital Libraries, JCDL 2017 - Toronto, Canada
Duration: Jun 19 2017 to Jun 23 2017

Publication series

Name: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
ISSN (Print): 1552-5996

Conference

Conference: 17th ACM/IEEE Joint Conference on Digital Libraries, JCDL 2017
Country: Canada
City: Toronto
Period: 6/19/17 to 6/23/17

All Science Journal Classification (ASJC) codes

  • Engineering(all)
