A modality converting approach for image annotation to overcome the inconsistent labels in training data

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Automatic image annotation (AIA), in which a system assigns descriptive keywords to an input image, has long been studied as a shared task and remains important because annotation keywords give users efficient access to ever-growing image collections. However, the performance of current AIA systems remains low. For many supervised methods, one difficulty is the inconsistency of annotation keywords in the training data, which arises naturally in manual annotation. For example, images of people may be annotated with “tourist” or “woman” depending on the scene. This inconsistency makes it difficult to annotate images to which several such similar keywords could apply. To address this difficulty, we propose a modality converting method that transforms an input image into the encyclopedic text of the keywords assigned to the image. With this modality conversion, similar keywords can share features derived from their texts. In the proposed method, we pair images with the Wikipedia articles whose titles are their annotation keywords. We train a modality converter from images to Wikipedia texts on the paired data using a neural network. The method then classifies the converted text into annotation keywords, as in text classification. Experimental results show that our method, based on the converted text, achieves relatively high performance compared with existing methods.
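The pipeline outlined in the abstract (pair images with keyword articles, convert an image into text features, then classify the converted text) can be illustrated with a toy sketch. This is not the authors' implementation. The article snippets, the two-dimensional image features, and the names `ARTICLES`, `TRAIN`, `convert` and `annotate` are all hypothetical. The neural-network converter is replaced here by a simple nearest-neighbour average, and the text classifier by cosine similarity over bag-of-words vectors.

```python
import math
from collections import Counter

# Hypothetical stand-ins for Wikipedia articles, keyed by annotation keyword.
ARTICLES = {
    "tourist": "a tourist is a person who travels for pleasure",
    "woman": "a woman is an adult female person",
    "mountain": "a mountain is a large landform that rises above the land",
}

# Shared vocabulary over all article texts.
VOCAB = sorted({word for text in ARTICLES.values() for word in text.split()})


def bow(text):
    """Bag-of-words count vector of a text over VOCAB."""
    counts = Counter(text.split())
    return [float(counts[word]) for word in VOCAB]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy training pairs: (made-up image feature vector, annotated keyword).
# The two "people" images carry different labels, mimicking label inconsistency.
TRAIN = [
    ([0.9, 0.1], "tourist"),
    ([0.8, 0.2], "woman"),
    ([0.1, 0.9], "mountain"),
]


def convert(image_feat, k=2):
    """Stand-in modality converter (assumption: the paper uses a neural network).

    Averages the article vectors of the k training images nearest to image_feat.
    """
    nearest = sorted(TRAIN, key=lambda pair: math.dist(pair[0], image_feat))[:k]
    vectors = [bow(ARTICLES[keyword]) for _, keyword in nearest]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def annotate(image_feat, top_n=2):
    """Rank keywords by similarity between the converted text and each article."""
    text_vec = convert(image_feat)
    scores = {kw: cosine(text_vec, bow(text)) for kw, text in ARTICLES.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


if __name__ == "__main__":
    print(annotate([0.85, 0.15]))
    print(annotate([0.1, 0.9], top_n=1))
```

In this toy setting, an image lying between the two "people" examples is converted into text features that resemble both the "tourist" and the "woman" article, so both keywords rank above "mountain". That is the kind of feature sharing between similar keywords that the abstract describes.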

Original language: English
Title of host publication: Pattern Recognition - ICPR International Workshops and Challenges, Proceedings
Editors: Alberto Del Bimbo, Marco Bertini, Stan Sclaroff, Tao Mei, Hugo Jair Escalante, Rita Cucchiara, Roberto Vezzani, Giovanni Maria Farinella
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 261-268
Number of pages: 8
ISBN (Print): 9783030687892
DOI
Publication status: Published - 2021
Event: 25th International Conference on Pattern Recognition Workshops, ICPR 2020 - Virtual, Online
Duration: 10 Jan 2021 → 15 Jan 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12662 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 25th International Conference on Pattern Recognition Workshops, ICPR 2020
City: Virtual, Online
Period: 1/10/21 → 1/15/21

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
