CNN training with graph-based sample preselection: application to handwritten character recognition

Frederic Rayar, Masanori Goto, Seiichi Uchida

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we present a study on sample preselection in large training data sets for CNN-based classification. To do so, we structure the input data set as a network representation, namely the Relative Neighbourhood Graph, and then extract some vectors of interest. The proposed preselection method is evaluated in the context of handwritten character recognition, using two data sets of up to several hundred thousand images. It is shown that the graph-based preselection can reduce the training data set without degrading the recognition accuracy of a shallow CNN model that is not pretrained.
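The abstract does not say which vectors are extracted from the graph, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It builds a Relative Neighbourhood Graph by brute force (two points are linked unless a third point is closer to both of them than they are to each other) and then keeps the samples that have at least one neighbour of a different class. That selection rule, the function names `relative_neighbourhood_graph` and `preselect_boundary_samples`, and the toy data are all hypothetical and chosen for illustration.

```python
from itertools import combinations
from math import dist


def relative_neighbourhood_graph(points):
    """Return the RNG edges as a set of index pairs (i, j) with i < j.

    Brute force, O(n^3): suitable for a toy example only.
    """
    n = len(points)
    edges = set()
    for i, j in combinations(range(n), 2):
        d_ij = dist(points[i], points[j])
        # (i, j) is an edge unless some third point k is strictly closer
        # to both i and j than i and j are to each other.
        blocked = any(
            max(dist(points[i], points[k]), dist(points[j], points[k])) < d_ij
            for k in range(n)
            if k != i and k != j
        )
        if not blocked:
            edges.add((i, j))
    return edges


def preselect_boundary_samples(points, labels):
    """Keep samples linked to a neighbour of another class.

    ASSUMPTION: this criterion is illustrative; the abstract only says
    that "some vectors of interest" are extracted from the graph.
    """
    keep = set()
    for i, j in relative_neighbourhood_graph(points):
        if labels[i] != labels[j]:
            keep.update((i, j))
    return sorted(keep)


# Toy data: six points on a line, two classes of three points each.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
labels = ["a", "a", "a", "b", "b", "b"]

selected = preselect_boundary_samples(points, labels)
print(selected)  # [2, 3]
```

On this toy data the graph links only consecutive points, so the two samples on either side of the class boundary are kept and the four interior samples are dropped. A reduced training set built this way would then be passed to the CNN in place of the full set.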

Original language: English
Title of host publication: Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 19-24
Number of pages: 6
ISBN (Electronic): 9781538633465
DOIs: https://doi.org/10.1109/DAS.2018.10
Publication status: Published - Jun 22 2018
Event: 13th IAPR International Workshop on Document Analysis Systems, DAS 2018 - Vienna, Austria
Duration: Apr 24 2018 to Apr 27 2018

Publication series

Name: Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018

Other

Other: 13th IAPR International Workshop on Document Analysis Systems, DAS 2018
Country: Austria
City: Vienna
Period: 4/24/18 to 4/27/18

Fingerprint

Character recognition

All Science Journal Classification (ASJC) codes

  • Computer Vision and Pattern Recognition
  • Signal Processing

Cite this

Rayar, F., Goto, M., & Uchida, S. (2018). CNN training with graph-based sample preselection: application to handwritten character recognition. In Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018 (pp. 19-24). (Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/DAS.2018.10

CNN training with graph-based sample preselection: application to handwritten character recognition. / Rayar, Frederic; Goto, Masanori; Uchida, Seiichi.

Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018. Institute of Electrical and Electronics Engineers Inc., 2018. p. 19-24 (Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018).

Rayar, F, Goto, M & Uchida, S 2018, CNN training with graph-based sample preselection: application to handwritten character recognition. in Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018. Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018, Institute of Electrical and Electronics Engineers Inc., pp. 19-24, 13th IAPR International Workshop on Document Analysis Systems, DAS 2018, Vienna, Austria, 4/24/18. https://doi.org/10.1109/DAS.2018.10
Rayar F, Goto M, Uchida S. CNN training with graph-based sample preselection: application to handwritten character recognition. In Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018. Institute of Electrical and Electronics Engineers Inc. 2018. p. 19-24. (Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018). https://doi.org/10.1109/DAS.2018.10
Rayar, Frederic; Goto, Masanori; Uchida, Seiichi. / CNN training with graph-based sample preselection: application to handwritten character recognition. Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018. Institute of Electrical and Electronics Engineers Inc., 2018. pp. 19-24 (Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018).
@inproceedings{5159bd03b25642828ae9341eb2fa040a,
title = "CNN training with graph-based sample preselection: application to handwritten character recognition",
abstract = "In this paper, we present a study on sample preselection in large training data set for CNN-based classification. To do so, we structure the input data set in a network representation, namely the Relative Neighbourhood Graph, and then extract some vectors of interest. The proposed preselection method is evaluated in the context of handwritten character recognition, by using two data sets, up to several hundred thousands of images. It is shown that the graph-based preselection can reduce the training data set without degrading the recognition accuracy of a non pretrained CNN shallow model.",
author = "Frederic Rayar and Masanori Goto and Seiichi Uchida",
year = "2018",
month = "6",
day = "22",
doi = "10.1109/DAS.2018.10",
language = "English",
series = "Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "19--24",
booktitle = "Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018",
address = "United States",

}

TY - GEN

T1 - CNN training with graph-based sample preselection

T2 - application to handwritten character recognition

AU - Rayar, Frederic

AU - Goto, Masanori

AU - Uchida, Seiichi

PY - 2018/6/22

Y1 - 2018/6/22

N2 - In this paper, we present a study on sample preselection in large training data set for CNN-based classification. To do so, we structure the input data set in a network representation, namely the Relative Neighbourhood Graph, and then extract some vectors of interest. The proposed preselection method is evaluated in the context of handwritten character recognition, by using two data sets, up to several hundred thousands of images. It is shown that the graph-based preselection can reduce the training data set without degrading the recognition accuracy of a non pretrained CNN shallow model.

AB - In this paper, we present a study on sample preselection in large training data set for CNN-based classification. To do so, we structure the input data set in a network representation, namely the Relative Neighbourhood Graph, and then extract some vectors of interest. The proposed preselection method is evaluated in the context of handwritten character recognition, by using two data sets, up to several hundred thousands of images. It is shown that the graph-based preselection can reduce the training data set without degrading the recognition accuracy of a non pretrained CNN shallow model.

UR - http://www.scopus.com/inward/record.url?scp=85050259581&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85050259581&partnerID=8YFLogxK

U2 - 10.1109/DAS.2018.10

DO - 10.1109/DAS.2018.10

M3 - Conference contribution

AN - SCOPUS:85050259581

T3 - Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018

SP - 19

EP - 24

BT - Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018

PB - Institute of Electrical and Electronics Engineers Inc.

ER -