Inverse analysis of vocal sound source using an analytical model of the vocal tract

Kazuya Yokota, Satoshi Ishikawa, Yosuke Koba, Shinya Kijimoto, Shohei Sugiki

Research output: Contribution to journal (Article)

Abstract

Diseases occurring near the vocal cords, such as laryngeal cancer, share the initial symptom of hoarseness of voice. The GRBAS (grade, roughness, breathiness, asthenia, strain) scale is used as an acoustic diagnostic method for these diseases, but its objectivity is not well established. Instead, more accurate diagnosis may be possible by capturing the waveform of the volume velocity at the vocal cords. The aim of this study is to enable voice disturbances to be diagnosed by identifying the sound-source waveform from voice measurements. For acoustic analysis of the vocal tract, we modeled the air inside it as concentrated masses connected by linear springs and dampers. We identified the shape of the vocal tract by making the natural frequencies of the analytical model correspond to the measured formant frequencies, and we calculated the sound-source waveform from the measured voice waveform. To assess the validity of the model, we measured actual voices and used the model to identify the vocal tract shapes and corresponding sound-source waveforms. The identified waveforms have an asymmetrical triangular form, which is a feature of actual human sound-source waveforms. Local solutions allow multiple vocal tract shapes to be identified from a single sample. However, mathematical analysis showed that these differ only in the amplitude of the sound-source waveform, which does not affect the waveform shape. Furthermore, we built an experimental device that simulates the human voice mechanism and comprises an acrylic vocal tract and a piston. We confirmed that the identified sound sources are similar to measured sound sources. We therefore conclude that our proposed methods are valid.
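The lumped-element idea in the abstract (air treated as concentrated masses joined by linear springs, with natural frequencies compared against formants) can be illustrated with a minimal sketch. It is not the authors' model: it assumes a uniform tube closed at the glottis and open at the lips, omits the dampers, and uses illustrative values for the tract length, sound speed, air density, and cross-sectional area. The function name `tube_natural_frequencies` and all its parameters are hypothetical.

```python
import numpy as np

def tube_natural_frequencies(length=0.17, n_masses=200, c=340.0,
                             rho=1.2, area=1.0e-4, n_modes=3):
    """Lowest natural frequencies (Hz) of a uniform air column modeled as
    a chain of point masses and linear springs (undamped sketch).

    Assumed boundary conditions: fixed at the closed (glottis) end,
    free at the open (lip) end. All default values are illustrative.
    """
    dx = length / n_masses
    m = rho * area * dx              # mass of the air in one segment
    k = rho * c**2 * area / dx       # stiffness of one segment (bulk modulus rho*c^2)

    # Stiffness matrix of a fixed-free chain: tridiagonal,
    # with only one spring attached to the last mass.
    idx = np.arange(n_masses)
    K = np.zeros((n_masses, n_masses))
    K[idx, idx] = 2.0 * k
    K[-1, -1] = k
    K[idx[:-1], idx[1:]] = -k
    K[idx[1:], idx[:-1]] = -k

    # Lumped masses; the node at the open end carries half a segment.
    masses = np.full(n_masses, m)
    masses[-1] = 0.5 * m

    # Symmetric form of K x = omega^2 M x: scale by M^(-1/2) on both sides.
    s = 1.0 / np.sqrt(masses)
    A = K * s[:, None] * s[None, :]
    omega_sq = np.linalg.eigvalsh(A)  # eigenvalues in ascending order
    return np.sqrt(omega_sq[:n_modes]) / (2.0 * np.pi)

if __name__ == "__main__":
    print(tube_natural_frequencies())
```

For a uniform tube the area cancels, and the results approach the quarter-wavelength resonances (2n - 1)c / (4L), that is, about 500, 1500, and 2500 Hz for the assumed L = 0.17 m and c = 340 m/s. Identifying a vocal tract shape, as the abstract describes, would additionally require a non-uniform area profile adjusted until these frequencies match the measured formants.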

Original language: English
Pages (from-to): 89-103
Number of pages: 15
Journal: Applied Acoustics
Volume: 150
DOI: 10.1016/j.apacoust.2019.02.005
Publication status: Published - Jul 1 2019

Fingerprint

waveforms
acoustics
vocal cords
applications of mathematics
dampers
pistons
resonant frequencies
grade
disturbances
roughness
cancer
air

All Science Journal Classification (ASJC) codes

  • Acoustics and Ultrasonics

Cite this

Inverse analysis of vocal sound source using an analytical model of the vocal tract. / Yokota, Kazuya; Ishikawa, Satoshi; Koba, Yosuke; Kijimoto, Shinya; Sugiki, Shohei.

In: Applied Acoustics, Vol. 150, 01.07.2019, p. 89-103.

@article{36032ca0eba140d29839a30368c357a6,
title = "Inverse analysis of vocal sound source using an analytical model of the vocal tract",
abstract = "Diseases occurring near the vocal cords, such as laryngeal cancer, share the initial symptom of hoarseness of voice. The GRBAS (grade, roughness, breathiness, asthenia, strain) scale is used as an acoustic diagnostic method for these diseases, but its objectivity is not well established. Instead, more accurate diagnosis may be possible by capturing the waveform of the volume velocity at the vocal cords. The aim of this study is to enable voice disturbances to be diagnosed by identifying the sound-source waveform from voice measurements. For acoustic analysis of the vocal tract, we modeled the air inside it as concentrated masses connected by linear springs and dampers. We identified the shape of the vocal tract by making the natural frequencies of the analytical model correspond to the measured formant frequencies, and we calculated the sound-source waveform from the measured voice waveform. To assess the validity of the model, we measured actual voices and used the model to identify the vocal tract shapes and corresponding sound-source waveforms. The identified waveforms have an asymmetrical triangular form, which is a feature of actual human sound-source waveforms. Local solutions allow multiple vocal tract shapes to be identified from a single sample. However, mathematical analysis showed that these differ only in the amplitude of the sound-source waveform, which does not affect the waveform shape. Furthermore, we built an experimental device that simulates the human voice mechanism and comprises an acrylic vocal tract and a piston. We confirmed that the identified sound sources are similar to measured sound sources. We therefore conclude that our proposed methods are valid.",
author = "Kazuya Yokota and Satoshi Ishikawa and Yosuke Koba and Shinya Kijimoto and Shohei Sugiki",
year = "2019",
month = "7",
day = "1",
doi = "10.1016/j.apacoust.2019.02.005",
language = "English",
volume = "150",
pages = "89--103",
journal = "Applied Acoustics",
issn = "0003-682X",
publisher = "Elsevier Limited",
}

TY - JOUR

T1 - Inverse analysis of vocal sound source using an analytical model of the vocal tract

AU - Yokota, Kazuya

AU - Ishikawa, Satoshi

AU - Koba, Yosuke

AU - Kijimoto, Shinya

AU - Sugiki, Shohei

PY - 2019/7/1

Y1 - 2019/7/1

AB - Diseases occurring near the vocal cords, such as laryngeal cancer, share the initial symptom of hoarseness of voice. The GRBAS (grade, roughness, breathiness, asthenia, strain) scale is used as an acoustic diagnostic method for these diseases, but its objectivity is not well established. Instead, more accurate diagnosis may be possible by capturing the waveform of the volume velocity at the vocal cords. The aim of this study is to enable voice disturbances to be diagnosed by identifying the sound-source waveform from voice measurements. For acoustic analysis of the vocal tract, we modeled the air inside it as concentrated masses connected by linear springs and dampers. We identified the shape of the vocal tract by making the natural frequencies of the analytical model correspond to the measured formant frequencies, and we calculated the sound-source waveform from the measured voice waveform. To assess the validity of the model, we measured actual voices and used the model to identify the vocal tract shapes and corresponding sound-source waveforms. The identified waveforms have an asymmetrical triangular form, which is a feature of actual human sound-source waveforms. Local solutions allow multiple vocal tract shapes to be identified from a single sample. However, mathematical analysis showed that these differ only in the amplitude of the sound-source waveform, which does not affect the waveform shape. Furthermore, we built an experimental device that simulates the human voice mechanism and comprises an acrylic vocal tract and a piston. We confirmed that the identified sound sources are similar to measured sound sources. We therefore conclude that our proposed methods are valid.

UR - http://www.scopus.com/inward/record.url?scp=85061403884&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85061403884&partnerID=8YFLogxK

U2 - 10.1016/j.apacoust.2019.02.005

DO - 10.1016/j.apacoust.2019.02.005

M3 - Article

AN - SCOPUS:85061403884

VL - 150

SP - 89

EP - 103

JO - Applied Acoustics

JF - Applied Acoustics

SN - 0003-682X

ER -