Semi-supervised coupled dictionary learning for cross-modal retrieval in internet images and texts

Xing Xu, Yang Yang, Atsushi Shimada, Rin Ichiro Taniguchi, Li He

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

20 Citations (Scopus)

Abstract

Massive amounts of images and text are now emerging on the Internet, creating demand for effective cross-modal retrieval. To eliminate the heterogeneity between the image and text modalities, existing subspace learning methods try to learn a common latent subspace in which cross-modal matching can be performed. However, these methods usually require fully paired samples (images with corresponding texts) and ignore the class label information that accompanies the paired samples. Indeed, class label information can reduce the semantic gap between modalities and explicitly guide the subspace learning procedure. In addition, the large quantities of unpaired samples (images or texts) may provide useful side information to enrich the representations in the learned subspace. Thus, in this paper we propose a novel model for the cross-modal retrieval problem. It consists of 1) a semi-supervised coupled dictionary learning step that generates homogeneously sparse representations for the different modalities from both paired and unpaired samples; and 2) a coupled feature mapping step that projects the sparse representations of the different modalities into a common subspace defined by class label information, where cross-modal matching is performed. Experiments on the large-scale web image dataset MIRFlickr-1M, in both fully paired and unpaired settings, show the effectiveness of the proposed model on the cross-modal retrieval task.
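To make the second step of the abstract more concrete, the following is a minimal sketch of mapping two modalities into a label-defined common subspace and matching across them. It is not the authors' formulation: the dictionary learning step is skipped entirely (the small "sparse codes" below are hand-made toy values), and the mapping is an ordinary ridge regression onto one-hot class labels fitted by gradient descent. All names (`fit_linear_map`, `project`, `cosine`, `one_hot`) and all numbers are hypothetical and chosen only for illustration.

```python
import math

def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot target vectors."""
    return [[1.0 if k == y else 0.0 for k in range(num_classes)] for y in labels]

def fit_linear_map(X, Y, lam=0.01, lr=0.5, epochs=500):
    """Fit W (d x c) by gradient descent on
    (1/(2n)) * sum ||xW - y||^2 + (lam/2) * ||W||^2  (assumed objective)."""
    n, d, c = len(X), len(X[0]), len(Y[0])
    W = [[0.0] * c for _ in range(d)]
    for _ in range(epochs):
        # Start from the gradient of the ridge penalty.
        grad = [[lam * W[j][k] for k in range(c)] for j in range(d)]
        for x, y in zip(X, Y):
            pred = [sum(x[j] * W[j][k] for j in range(d)) for k in range(c)]
            for j in range(d):
                for k in range(c):
                    grad[j][k] += x[j] * (pred[k] - y[k]) / n
        for j in range(d):
            for k in range(c):
                W[j][k] -= lr * grad[j][k]
    return W

def project(x, W):
    """Map one modality-specific code into the common (label) space."""
    return [sum(x[j] * W[j][k] for j in range(len(x))) for k in range(len(W[0]))]

def cosine(a, b):
    """Cosine similarity with a small constant to avoid division by zero."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb + 1e-12)

# Toy stand-ins for sparse codes; the two modalities have different dimensions.
img_codes = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.8, 0.0]]
img_labels = [0, 0, 1, 1]
txt_codes = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.8, 0.2]]
txt_labels = [0, 0, 1, 1]

# One mapping per modality, both targeting the same 2-class label space.
W_img = fit_linear_map(img_codes, one_hot(img_labels, 2))
W_txt = fit_linear_map(txt_codes, one_hot(txt_labels, 2))

# Image-to-text retrieval: project an unseen image code, rank texts by cosine.
query = project([0.95, 0.05, 0.1], W_img)
txt_proj = [project(t, W_txt) for t in txt_codes]
ranking = sorted(range(len(txt_codes)),
                 key=lambda i: cosine(query, txt_proj[i]), reverse=True)
print("ranked text indices:", ranking)
print("their labels:", [txt_labels[i] for i in ranking])
```

On this toy data the two texts labeled class 0 should rank ahead of the two labeled class 1, because both modalities are compared only after projection into the shared label space. Raw image and text codes could not be compared directly, since they differ in dimension and feature meaning.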

Original language: English
Title of host publication: MM 2015 - Proceedings of the 2015 ACM Multimedia Conference
Publisher: Association for Computing Machinery, Inc
Pages: 847-850
Number of pages: 4
ISBN (Electronic): 9781450334594
DOIs: https://doi.org/10.1145/2733373.2806346
Publication status: Published - Oct 13 2015
Event: 23rd ACM International Conference on Multimedia, MM 2015 - Brisbane, Australia
Duration: Oct 26 2015 - Oct 30 2015

Publication series

Name: MM 2015 - Proceedings of the 2015 ACM Multimedia Conference

Other

Other: 23rd ACM International Conference on Multimedia, MM 2015
Country: Australia
City: Brisbane
Period: 10/26/15 - 10/30/15

All Science Journal Classification (ASJC) codes

  • Media Technology
  • Computer Graphics and Computer-Aided Design
  • Computer Vision and Pattern Recognition
  • Software


  • Cite this

    Xu, X., Yang, Y., Shimada, A., Taniguchi, R. I., & He, L. (2015). Semi-supervised coupled dictionary learning for cross-modal retrieval in internet images and texts. In MM 2015 - Proceedings of the 2015 ACM Multimedia Conference (pp. 847-850). (MM 2015 - Proceedings of the 2015 ACM Multimedia Conference). Association for Computing Machinery, Inc. https://doi.org/10.1145/2733373.2806346