TY - GEN
T1 - Selective super-resolution for scene text images
AU - Nakao, Ryo
AU - Iwana, Brian Kenji
AU - Uchida, Seiichi
N1 - Funding Information:
This work was supported by JSPS KAKENHI Grant Number JP17H06100.
Publisher Copyright:
© 2019 IEEE.
PY - 2019/9
Y1 - 2019/9
N2 - In this paper, we enhance super-resolution for images containing scene text. Specifically, this paper proposes the use of Super-Resolution Convolutional Neural Networks (SRCNN) constructed to tackle issues associated with characters and text. We demonstrate that standard SRCNNs trained for general object super-resolution are not sufficient and that the proposed method is a viable approach for creating a robust model for text. To do so, we analyze the characteristics of SRCNNs through quantitative and qualitative evaluations with scene text data. In addition, we analyze the correlation between layers using Singular Vector Canonical Correlation Analysis (SVCCA) and compare the filters of each SRCNN using t-SNE. Furthermore, in order to create a unified super-resolution model specialized for both text and objects, we use a model combining SRCNNs trained on the different data types via Content-wise Network Fusion (CNF). We integrate the SRCNN trained on character images with the SRCNN trained on general object images, and verify the accuracy improvement on scene images that include text. We also examine how each SRCNN affects super-resolution images after fusion.
AB - In this paper, we enhance super-resolution for images containing scene text. Specifically, this paper proposes the use of Super-Resolution Convolutional Neural Networks (SRCNN) constructed to tackle issues associated with characters and text. We demonstrate that standard SRCNNs trained for general object super-resolution are not sufficient and that the proposed method is a viable approach for creating a robust model for text. To do so, we analyze the characteristics of SRCNNs through quantitative and qualitative evaluations with scene text data. In addition, we analyze the correlation between layers using Singular Vector Canonical Correlation Analysis (SVCCA) and compare the filters of each SRCNN using t-SNE. Furthermore, in order to create a unified super-resolution model specialized for both text and objects, we use a model combining SRCNNs trained on the different data types via Content-wise Network Fusion (CNF). We integrate the SRCNN trained on character images with the SRCNN trained on general object images, and verify the accuracy improvement on scene images that include text. We also examine how each SRCNN affects super-resolution images after fusion.
UR - http://www.scopus.com/inward/record.url?scp=85079846870&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85079846870&partnerID=8YFLogxK
U2 - 10.1109/ICDAR.2019.00071
DO - 10.1109/ICDAR.2019.00071
M3 - Conference contribution
AN - SCOPUS:85079846870
T3 - Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
SP - 401
EP - 406
BT - Proceedings - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019
PB - IEEE Computer Society
T2 - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019
Y2 - 20 September 2019 through 25 September 2019
ER -