Visual saliency models for text detection in real world

Renwu Gao, Seiichi Uchida, Asif Shahab, Faisal Shafait, Volkmar Frinken

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

This paper evaluates how salient text is in natural scenes, using visual saliency models. A large-scale scene image database with pixel-level ground truth is created for this purpose. Using this database and five state-of-the-art models, visual saliency maps that represent the degree of saliency of objects are calculated. The receiver operating characteristic (ROC) curve is used to evaluate the saliency of scene text as computed by each model. A visualization is given of the distribution of scene text and non-text in the space spanned by three kinds of saliency maps, which are calculated using Itti's visual saliency model with intensity, color and orientation features. This distribution indicates that text characters are more salient than their non-text neighbors and can be distinguished from the background. Therefore, scene text can be extracted from scene images. With this in mind, a new visual saliency architecture, named the hierarchical visual saliency model, is proposed. The hierarchical visual saliency model is based on Itti's model and consists of two stages. In the first stage, Itti's model is used to calculate the saliency map, and Otsu's global thresholding algorithm is applied to extract the salient region of interest. In the second stage, Itti's model is applied to the salient region to calculate the final saliency map. An experimental evaluation demonstrates that the proposed model outperforms Itti's model in terms of captured scene text.
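The two-stage structure described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `toy_saliency` is a hypothetical stand-in that scores each pixel by its intensity contrast against the mean of the region under consideration, whereas the paper uses Itti's model with intensity, color and orientation features. The function names, the 256-bin histogram, and the grayscale input are all assumptions made for this sketch.

```python
import numpy as np

def toy_saliency(gray, mask=None):
    """Hypothetical stand-in for Itti's model (NOT the real model):
    contrast of each pixel against the mean intensity of the masked region."""
    gray = gray.astype(np.float64)
    if mask is None:
        mask = np.ones(gray.shape, dtype=bool)
    if not mask.any():
        return np.zeros_like(gray)
    sal = np.abs(gray - gray[mask].mean())
    sal[~mask] = 0.0                      # pixels outside the region get no saliency
    peak = sal.max()
    return sal / peak if peak > 0 else sal  # normalize to [0, 1]

def otsu_threshold(values, bins=256):
    """Otsu's global threshold for values in [0, 1]: pick the histogram
    split that maximizes the between-class variance."""
    hist, edges = np.histogram(values, bins=bins, range=(0.0, 1.0))
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist)                  # pixel count at or below each bin
    w1 = w0[-1] - w0                      # pixel count above each bin
    m0 = np.cumsum(hist * centers)
    mu0 = m0 / np.maximum(w0, 1)          # mean of the lower class
    mu1 = (m0[-1] - m0) / np.maximum(w1, 1)  # mean of the upper class
    between = w0 * w1 * (mu0 - mu1) ** 2
    # Return the upper edge of the best bin so the lower class lies at or below it.
    return edges[1:][np.argmax(between)]

def hierarchical_saliency(gray):
    """Stage 1: saliency map + Otsu threshold -> salient region.
    Stage 2: saliency recomputed on the salient region only."""
    first = toy_saliency(gray)
    region = first > otsu_threshold(first.ravel())
    final = toy_saliency(gray, mask=region)
    return final, region

if __name__ == "__main__":
    # Synthetic scene: dark background, a mid-bright block, a small bright patch.
    img = np.full((20, 20), 0.2)
    img[2:7, 2:7] = 0.6
    img[15:17, 15:17] = 1.0
    final, region = hierarchical_saliency(img)
    print("salient pixels after stage 1:", int(region.sum()))
    print("max saliency after stage 2:", float(final.max()))
```

In this sketch the second pass measures contrast only among the pixels kept by the first pass, so the background no longer influences the final map. Swapping `toy_saliency` for a real saliency model would leave the two-stage structure unchanged.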

Original language: English
Article number: e114539
Journal: PLoS One
Volume: 9
Issue number: 12
DOIs: https://doi.org/10.1371/journal.pone.0114539
Publication status: Published - Dec 10 2014

Fingerprint

Databases
ROC Curve
Color
Visualization
Pixels
extracts

All Science Journal Classification (ASJC) codes

  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)

Cite this

Visual saliency models for text detection in real world. / Gao, Renwu; Uchida, Seiichi; Shahab, Asif; Shafait, Faisal; Frinken, Volkmar.

In: PloS one, Vol. 9, No. 12, e114539, 10.12.2014.

Gao, R, Uchida, S, Shahab, A, Shafait, F & Frinken, V 2014, 'Visual saliency models for text detection in real world', PloS one, vol. 9, no. 12, e114539. https://doi.org/10.1371/journal.pone.0114539
Gao, Renwu ; Uchida, Seiichi ; Shahab, Asif ; Shafait, Faisal ; Frinken, Volkmar. / Visual saliency models for text detection in real world. In: PloS one. 2014 ; Vol. 9, No. 12.
@article{0b9dfdd9f69d4779a9e40b85bc3e9b5f,
title = "Visual saliency models for text detection in real world",
abstract = "This paper evaluates the degree of saliency of texts in natural scenes using visual saliency models. A large scale scene image database with pixel level ground truth is created for this purpose. Using this scene image database and five state-of-the-art models, visual saliency maps that represent the degree of saliency of the objects are calculated. The receiver operating characteristic curve is employed in order to evaluate the saliency of scene texts, which is calculated by visual saliency models. A visualization of the distribution of scene texts and non-texts in the space constructed by three kinds of saliency maps, which are calculated using Itti's visual saliency model with intensity, color and orientation features, is given. This visualization of distribution indicates that text characters are more salient than their non-text neighbors, and can be captured from the background. Therefore, scene texts can be extracted from the scene images. With this in mind, a new visual saliency architecture, named hierarchical visual saliency model, is proposed. Hierarchical visual saliency model is based on Itti's model and consists of two stages. In the first stage, Itti's model is used to calculate the saliency map, and Otsu's global thresholding algorithm is applied to extract the salient region that we are interested in. In the second stage, Itti's model is applied to the salient region to calculate the final saliency map. An experimental evaluation demonstrates that the proposed model outperforms Itti's model in terms of captured scene texts.",
author = "Renwu Gao and Seiichi Uchida and Asif Shahab and Faisal Shafait and Volkmar Frinken",
year = "2014",
month = "12",
day = "10",
doi = "10.1371/journal.pone.0114539",
language = "English",
volume = "9",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "12",

}

TY - JOUR

T1 - Visual saliency models for text detection in real world

AU - Gao, Renwu

AU - Uchida, Seiichi

AU - Shahab, Asif

AU - Shafait, Faisal

AU - Frinken, Volkmar

PY - 2014/12/10

Y1 - 2014/12/10

AB - This paper evaluates the degree of saliency of texts in natural scenes using visual saliency models. A large scale scene image database with pixel level ground truth is created for this purpose. Using this scene image database and five state-of-the-art models, visual saliency maps that represent the degree of saliency of the objects are calculated. The receiver operating characteristic curve is employed in order to evaluate the saliency of scene texts, which is calculated by visual saliency models. A visualization of the distribution of scene texts and non-texts in the space constructed by three kinds of saliency maps, which are calculated using Itti's visual saliency model with intensity, color and orientation features, is given. This visualization of distribution indicates that text characters are more salient than their non-text neighbors, and can be captured from the background. Therefore, scene texts can be extracted from the scene images. With this in mind, a new visual saliency architecture, named hierarchical visual saliency model, is proposed. Hierarchical visual saliency model is based on Itti's model and consists of two stages. In the first stage, Itti's model is used to calculate the saliency map, and Otsu's global thresholding algorithm is applied to extract the salient region that we are interested in. In the second stage, Itti's model is applied to the salient region to calculate the final saliency map. An experimental evaluation demonstrates that the proposed model outperforms Itti's model in terms of captured scene texts.

UR - http://www.scopus.com/inward/record.url?scp=84916620538&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84916620538&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0114539

DO - 10.1371/journal.pone.0114539

M3 - Article

VL - 9

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 12

M1 - e114539

ER -