Coupled dictionary learning and feature mapping for cross-modal retrieval

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

19 Citations (Scopus)

Abstract

In this paper, we investigate the problem of modeling images and associated text for cross-modal retrieval tasks such as text-to-image search and image-to-text search. To make the data from image and text modalities comparable, previous cross-modal retrieval methods directly learn two projection matrices to map the raw features of the two modalities into a common subspace, in which cross-modal data matching can be performed. However, the different feature representations and correlation structures of different modalities inhibit these methods from efficiently modeling the relationships across modalities through a common subspace. To handle the diversities of different modalities, we first leverage the coupled dictionary learning method to generate homogeneous sparse representations for different modalities by associating and jointly updating their dictionaries. We then use a coupled feature mapping scheme to project the derived sparse representations from different modalities into a common subspace in which cross-modal retrieval can be performed. Experiments on a variety of cross-modal retrieval tasks demonstrate that the proposed method outperforms the state-of-the-art approaches.
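The two stages described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm: the abstract gives no objective functions, so the sketch makes loud assumptions. Coupling is approximated by stacking paired image and text features so that each pair shares one sparse code during training, sparse coding uses plain ISTA, dictionaries are updated with a regularized least-squares step, and the "common subspace" is assumed to be a one-hot label space reached by ridge regression. All function names and parameters (`coupled_dictionaries`, `ridge_map`, `lam`, `k`, and so on) are hypothetical.

```python
import numpy as np

def soft_threshold(Z, t):
    """Elementwise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def sparse_codes(X, D, lam=0.1, n_iter=50):
    """ISTA for min_A 0.5*||X - D A||_F^2 + lam*||A||_1.
    X is (features, samples); D is (features, atoms)."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        A = soft_threshold(A - step * D.T @ (D @ A - X), step * lam)
    return A

def coupled_dictionaries(X_img, X_txt, k=16, lam=0.1, n_outer=10, seed=0):
    """ASSUMPTION: couple the modalities by stacking paired samples, so each
    image/text pair shares one sparse code while both dictionaries update."""
    rng = np.random.default_rng(seed)
    X = np.vstack([X_img, X_txt])
    D = rng.standard_normal((X.shape[0], k))
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    for _ in range(n_outer):
        A = sparse_codes(X, D, lam)
        # Regularized least-squares dictionary update, then renormalize atoms.
        D = X @ A.T @ np.linalg.inv(A @ A.T + 1e-6 * np.eye(k))
        D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    d_img = X_img.shape[0]
    return D[:d_img], D[d_img:]

def ridge_map(A, Y, reg=1e-2):
    """ASSUMPTION: the common subspace is a label space. Returns P minimizing
    ||P A - Y||_F^2 + reg*||P||_F^2 in closed form."""
    return Y @ A.T @ np.linalg.inv(A @ A.T + reg * np.eye(A.shape[0]))

def cosine_scores(Q, G):
    """Cosine similarity between query columns Q and gallery columns G."""
    Qn = Q / (np.linalg.norm(Q, axis=0, keepdims=True) + 1e-12)
    Gn = G / (np.linalg.norm(G, axis=0, keepdims=True) + 1e-12)
    return Qn.T @ Gn
```

In this sketch, retrieval would code each modality with its own dictionary (`sparse_codes(X_img, D_img)`, `sparse_codes(X_txt, D_txt)`), project the codes with the per-modality maps from `ridge_map`, and rank gallery items by `cosine_scores`. The published method may differ in every one of these choices.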

Original language: English
Title of host publication: 2015 IEEE International Conference on Multimedia and Expo, ICME 2015
Publisher: IEEE Computer Society
Volume: 2015-August
ISBN (Electronic): 9781479970827
DOIs: https://doi.org/10.1109/ICME.2015.7177396
Publication status: Published - Aug 4 2015
Event: IEEE International Conference on Multimedia and Expo, ICME 2015 - Turin, Italy
Duration: Jun 29 2015 to Jul 3 2015

Other

Other: IEEE International Conference on Multimedia and Expo, ICME 2015
Country: Italy
City: Turin
Period: 6/29/15 to 7/3/15

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computer Science Applications

Cite this

Xu, X., Shimada, A., Taniguchi, R-I., & He, L. (2015). Coupled dictionary learning and feature mapping for cross-modal retrieval. In 2015 IEEE International Conference on Multimedia and Expo, ICME 2015 (Vol. 2015-August). [7177396] IEEE Computer Society. https://doi.org/10.1109/ICME.2015.7177396