Sound Source Detection by Learning (学習による映像中の音源同定)

池田 千廣, ヤオカイ フォン, 内田 誠一

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Sound source detection in an image is a difficult inverse problem in which the pixels belonging to the sound source area must be estimated. This paper considers an accurate sound source detection method built on a machine learning framework. Specifically, the proposed method relies on an AdaBoost-based learning scheme that decides whether each pixel belongs to a sound source. Weak learners are trained to discriminate positive samples (pairs of image features around sound sources and audio features) from negative samples (pairs of image features distant from sound sources and audio features). This learning scheme combines the multimodal information (i.e., image and audio) in a simple way: some weak learners discriminate the samples by a single image feature and others by a single audio feature. The performance of this naive implementation, based on a simple combination of multimodal information, was examined experimentally, and its essential problem was revealed, along with a possible remedy.
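The learning scheme described in the abstract can be illustrated with a minimal sketch of discrete AdaBoost in which every weak learner is a threshold on one feature column, so each learner looks at either a single image feature or a single audio feature. This is not the authors' implementation: the feature dimensions, the synthetic data, and the names `fit_stump`, `adaboost_fit`, and `adaboost_predict` are all assumptions made for illustration.

```python
import numpy as np

def fit_stump(X, y, w):
    """Return (weighted error, feature index, threshold, polarity) of the best
    single-feature threshold classifier under sample weights w."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost_fit(X, y, n_rounds=10):
    """Discrete AdaBoost with labels in {-1, +1}; returns a list of weighted stumps."""
    w = np.full(len(y), 1.0 / len(y))
    model = []
    for _ in range(n_rounds):
        err, j, thr, pol = fit_stump(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)   # avoid division by zero in alpha
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
        w = w * np.exp(-alpha * y * pred)       # up-weight misclassified samples
        w = w / w.sum()
        model.append((alpha, j, thr, pol))
    return model

def adaboost_predict(model, X):
    """Sign of the alpha-weighted vote of all stumps."""
    score = np.zeros(len(X))
    for alpha, j, thr, pol in model:
        score += alpha * np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
    return np.where(score >= 0, 1, -1)

# Synthetic stand-in data (assumption): 3 image features and 2 audio features
# per sample; +1 = pixel near a sound source, -1 = pixel far from it.
rng = np.random.default_rng(0)
n = 200
y = np.where(rng.random(n) < 0.5, 1, -1)
image_feats = rng.normal(size=(n, 3))
image_feats[:, 0] += 2.0 * y                    # one informative image feature
audio_feats = rng.normal(size=(n, 2))           # uninformative in this toy setup
X = np.hstack([image_feats, audio_feats])       # columns 0-2 image, 3-4 audio

model = adaboost_fit(X, y, n_rounds=10)
train_acc = (adaboost_predict(model, X) == y).mean()
modality = ["image" if j < 3 else "audio" for _, j, _, _ in model]
```

Inspecting `modality` shows which kind of feature each round selected, which is one simple way to see how such a combination weighs the two modalities on a given data set.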
Translated title of the contribution: Sound Source Detection by Learning
Original language: Japanese
Pages (from-to): 93-98
Number of pages: 6
Journal: IEICE technical report
Volume: 110
Issue number: 188
Publication status: Published - Aug 29, 2010
