Sound source detection using multiple noise models

Shoichi Matsunaga, Masahide Yamaguchi, Katsuya Yamauchi, Masaru Yamashita

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

This paper describes a sound source detection approach based on elaborate noise-modeling techniques for audio indexing. For accurate detection, we devised two methods to generate multiple-noise models through clustering techniques. One method is based on frame-wise data similarity, and the other is based on noise source similarity. The former method employs K-means clustering and a smoothing technique to avoid inaccurate segmentation. The latter method involves noise modeling based on a tree data structure generated by the progressive merging of noise clusters. The classification experiments show that by using these proposed methods, audio sources can be detected with better accuracy than that achieved by a conventional method. When four noise models generated by the latter method were used, the noise detection performance increased by 3.9% for the periods in which the sound sources did not overlap. With regard to the experiments for an audio stream that included overlapped segments, the noise detection performance increased by 1.2% without a decrease in the speech detection performance.
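
As a rough illustration of the two clustering ideas named in the abstract, the Python sketch below clusters frames with K-means and smooths the resulting label sequence, then progressively merges per-source noise clusters while recording the merge tree. The feature vectors, the squared Euclidean distance, the majority-vote smoothing window, and the centroid-based merge rule are all assumptions made for this example; the abstract does not specify them, and this is not the authors' implementation.

```python
import random
from collections import Counter


def sqdist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(frames, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster label per frame."""
    rng = random.Random(seed)
    centers = [list(f) for f in rng.sample(frames, k)]
    labels = [0] * len(frames)
    for _ in range(iters):
        # Assign each frame to its nearest center.
        labels = [min(range(k), key=lambda c: sqdist(f, centers[c]))
                  for f in frames]
        # Move each center to the mean of its assigned frames.
        for c in range(k):
            members = [f for f, lab in zip(frames, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def smooth_labels(labels, half_window=2):
    """Majority vote over a sliding window (assumed smoothing rule)."""
    out = []
    for i in range(len(labels)):
        window = labels[max(0, i - half_window): i + half_window + 1]
        out.append(Counter(window).most_common(1)[0][0])
    return out


def merge_noise_clusters(sources, target):
    """Merge the two closest clusters until `target` remain.

    sources: dict mapping a noise-source name to its mean feature vector.
    Returns (groups, history); history lists each merge, i.e. the tree.
    """
    clusters = [((name,), list(vec), 1) for name, vec in sources.items()]
    history = []
    while len(clusters) > target:
        # Find the closest pair of clusters by centroid distance.
        _, i, j = min((sqdist(a[1], b[1]), i, j)
                      for i, a in enumerate(clusters)
                      for j, b in enumerate(clusters) if i < j)
        a, b = clusters[i], clusters[j]
        n = a[2] + b[2]
        mean = [(x * a[2] + y * b[2]) / n for x, y in zip(a[1], b[1])]
        history.append((a[0], b[0]))
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append((a[0] + b[0], mean, n))
    return [c[0] for c in clusters], history
```

In this sketch, smoothing removes isolated label flips that would otherwise fragment a segment, and `target=4` in `merge_noise_clusters` would correspond to keeping four noise models, the setting mentioned in the abstract.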

Original language: English
Title of host publication: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
Pages: 2025-2028
Number of pages: 4
DOIs: https://doi.org/10.1109/ICASSP.2008.4518037
Publication status: Published - Sep 16 2008
Externally published: Yes
Event: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP - Las Vegas, NV, United States
Duration: Mar 31 2008 to Apr 4 2008

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Other

Other: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
Country: United States
City: Las Vegas, NV
Period: 3/31/08 to 4/4/08

Fingerprint

Acoustic noise
Acoustic waves
Acoustics
Merging
Data structures
Experiments
Smoothing

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Matsunaga, S., Yamaguchi, M., Yamauchi, K., & Yamashita, M. (2008). Sound source detection using multiple noise models. In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP (pp. 2025-2028). [4518037] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). https://doi.org/10.1109/ICASSP.2008.4518037

@inproceedings{805d6da22941468490d1e20560d02b57,
title = "Sound source detection using multiple noise models",
abstract = "This paper describes a sound source detection approach based on elaborate noise-modeling techniques for audio indexing. For accurate detection, we devised two methods to generate multiple-noise models through clustering techniques. One method is based on frame-wise data similarity, and the other is based on noise source similarity. The former method employs K-means clustering and a smoothing technique to avoid inaccurate segmentation. The latter method involves noise modeling based on a tree data structure generated by the progressive merging of noise clusters. The classification experiments show that by using these proposed methods, audio sources can be detected with better accuracy than that achieved by a conventional method. When four noise models generated by the latter method were used, the noise detection performance increased by 3.9{\%} for the periods in which the sound sources did not overlap. With regard to the experiments for an audio stream that included overlapped segments, the noise detection performance increased by 1.2{\%} without a decrease in the speech detection performance.",
author = "Shoichi Matsunaga and Masahide Yamaguchi and Katsuya Yamauchi and Masaru Yamashita",
year = "2008",
month = "9",
day = "16",
doi = "10.1109/ICASSP.2008.4518037",
language = "English",
isbn = "1424414849",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
pages = "2025--2028",
booktitle = "2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP",
}

TY - GEN

T1 - Sound source detection using multiple noise models

AU - Matsunaga, Shoichi

AU - Yamaguchi, Masahide

AU - Yamauchi, Katsuya

AU - Yamashita, Masaru

PY - 2008/9/16

Y1 - 2008/9/16

N2 - This paper describes a sound source detection approach based on elaborate noise-modeling techniques for audio indexing. For accurate detection, we devised two methods to generate multiple-noise models through clustering techniques. One method is based on frame-wise data similarity, and the other is based on noise source similarity. The former method employs K-means clustering and a smoothing technique to avoid inaccurate segmentation. The latter method involves noise modeling based on a tree data structure generated by the progressive merging of noise clusters. The classification experiments show that by using these proposed methods, audio sources can be detected with better accuracy than that achieved by a conventional method. When four noise models generated by the latter method were used, the noise detection performance increased by 3.9% for the periods in which the sound sources did not overlap. With regard to the experiments for an audio stream that included overlapped segments, the noise detection performance increased by 1.2% without a decrease in the speech detection performance.

UR - http://www.scopus.com/inward/record.url?scp=51449100759&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=51449100759&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2008.4518037

DO - 10.1109/ICASSP.2008.4518037

M3 - Conference contribution

AN - SCOPUS:51449100759

SN - 1424414849

SN - 9781424414840

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 2025

EP - 2028

BT - 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP

ER -