Supervised machine learning-based classification of oral malodor based on the microbiota in saliva samples

Yoshio Nakano, Toru Takeshita, Noriaki Kamio, Susumu Shiota, Yukie Shibata, Nao Suzuki, Masahiro Yoneda, Takao Hirofuji, Yoshihisa Yamashita

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Objective: This study presents an effective method of classifying oral malodor from the oral microbiota in saliva by using a support vector machine (SVM), an artificial neural network (ANN), and a decision tree. The approach uses the concentration of methyl mercaptan in mouth air as an indicator of oral malodor, and the peak areas of terminal restriction fragment (T-RF) length polymorphisms (T-RFLPs) of the 16S rRNA gene as input data for supervised machine-learning methods, without identifying the specific species that produce malodorous compounds.

Methods: 16S rRNA genes were amplified from saliva samples from 309 subjects, and T-RFLP analysis was carried out on the DNA fragments. T-RFLP analysis describes the microbiota as a set of fragment lengths and peak areas corresponding to bacterial strains. The peak area is equivalent to the frequency of a specific fragment when one molecule is selected from the terminal fragments. A second frequency is obtained by dividing the number of species-containing samples by the total number of samples. An SVM, an ANN, and a decision tree were trained on these two frequencies in 308 samples and then used to classify the presence or absence of methyl mercaptan in the mouth air of the remaining subject.

Results: The SVM trained on T-RF proportions expressed as entropy achieved the highest classification accuracy, with a sensitivity of 51.1% and a specificity of 95.0%. The ANN and the decision tree gave lower classification accuracies. Only the ANN improved when weighted with entropy derived from the frequency of appearance in samples, which raised its accuracy to 81.9%, with a sensitivity of 60.2% and a specificity of 90.5%. The decision tree showed low classification accuracy under all conditions.

Conclusions: Models that classify the presence of methyl mercaptan, a volatile sulfur-containing compound that causes oral malodor, were developed using T-RF proportions and frequencies. SVM classifiers classified the presence of methyl mercaptan with high specificity, and this classification is expected to be useful for screening saliva for oral malodor before visits to specialist clinics. Classification by an SVM and an ANN does not require identification of the oral microbiota species responsible for the malodor, and the ANN also does not require the proportions of T-RFs.
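The evaluation described in the Methods (train on 308 subjects, classify the remaining one) reads as a leave-one-out scheme. The sketch below illustrates that scheme together with the two frequencies and the sensitivity and specificity calculations. It is not the authors' implementation: the abstract does not give the entropy formula, so `entropy_term` assumes the Shannon term -p log2 p; a simple nearest-centroid rule stands in for the SVM, ANN, and decision tree; and the peak areas, labels, and all function names are invented for illustration.

```python
import math


def to_proportions(peak_areas):
    """Normalise one sample's T-RF peak areas so that they sum to 1."""
    total = sum(peak_areas)
    return [a / total if total else 0.0 for a in peak_areas]


def appearance_frequency(samples):
    """Per T-RF: number of samples containing it divided by all samples."""
    n = len(samples)
    return [sum(1 for v in column if v > 0) / n for column in zip(*samples)]


def entropy_term(p):
    """Assumed entropy form -p*log2(p), written as p*log2(1/p); 0 for p = 0."""
    return 0.0 if p <= 0.0 else p * math.log2(1.0 / p)


def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (not the paper's SVM/ANN): closest class mean wins."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out(features, labels, predict=nearest_centroid_predict):
    """Hold out each sample once, train on the rest, tally the outcomes."""
    tp = tn = fp = fn = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        predicted = predict(train_x, train_y, features[i])
        if labels[i] == 1:
            if predicted == 1:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == 0:
                tn += 1
            else:
                fp += 1
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / len(features),
    }


# Invented toy data: rows are samples, columns are three T-RF peak areas.
toy_peaks = [
    [85, 15, 0], [70, 20, 10], [75, 20, 5],   # label 1: methyl mercaptan present
    [10, 20, 70], [5, 15, 80], [10, 30, 60],  # label 0: absent
]
labels = [1, 1, 1, 0, 0, 0]

proportions = [to_proportions(sample) for sample in toy_peaks]
entropy_weights = [entropy_term(f) for f in appearance_frequency(toy_peaks)]
result = leave_one_out(proportions, labels)
print(result)
print(entropy_weights)
```

Passing a different `predict` function to `leave_one_out` would swap in another classifier without changing the evaluation loop, which is how the three models in the study could be compared on equal terms.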

Original language: English
Pages (from-to): 97-101
Number of pages: 5
Journal: Artificial Intelligence in Medicine
Volume: 60
Issue number: 2
DOI: 10.1016/j.artmed.2013.12.001
Publication status: Published - Feb 1 2014

Fingerprint

Microbiota
Saliva
Learning systems
Decision Trees
Support vector machines
Neural networks
Sulfhydryl Compounds
Polymorphism
Entropy
rRNA Genes
Mouth
Genes
Air
Sulfur Compounds
Information analysis
Restriction Fragment Length Polymorphisms
Supervised Machine Learning
Screening
DNA

All Science Journal Classification (ASJC) codes

  • Medicine (miscellaneous)
  • Artificial Intelligence

Cite this

Supervised machine learning-based classification of oral malodor based on the microbiota in saliva samples. / Nakano, Yoshio; Takeshita, Toru; Kamio, Noriaki; Shiota, Susumu; Shibata, Yukie; Suzuki, Nao; Yoneda, Masahiro; Hirofuji, Takao; Yamashita, Yoshihisa.

In: Artificial Intelligence in Medicine, Vol. 60, No. 2, 01.02.2014, p. 97-101.

@article{cfd3ac499b0c47f7b4e3a58005a79e5a,
title = "Supervised machine learning-based classification of oral malodor based on the microbiota in saliva samples",
author = "Yoshio Nakano and Toru Takeshita and Noriaki Kamio and Susumu Shiota and Yukie Shibata and Nao Suzuki and Masahiro Yoneda and Takao Hirofuji and Yoshihisa Yamashita",
year = "2014",
month = "2",
day = "1",
doi = "10.1016/j.artmed.2013.12.001",
language = "English",
volume = "60",
pages = "97--101",
journal = "Artificial Intelligence in Medicine",
issn = "0933-3657",
publisher = "Elsevier",
number = "2",

}

TY - JOUR

T1 - Supervised machine learning-based classification of oral malodor based on the microbiota in saliva samples

AU - Nakano, Yoshio

AU - Takeshita, Toru

AU - Kamio, Noriaki

AU - Shiota, Susumu

AU - Shibata, Yukie

AU - Suzuki, Nao

AU - Yoneda, Masahiro

AU - Hirofuji, Takao

AU - Yamashita, Yoshihisa

PY - 2014/2/1

Y1 - 2014/2/1

UR - http://www.scopus.com/inward/record.url?scp=84893743957&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84893743957&partnerID=8YFLogxK

U2 - 10.1016/j.artmed.2013.12.001

DO - 10.1016/j.artmed.2013.12.001

M3 - Article

C2 - 24439218

AN - SCOPUS:84893743957

VL - 60

SP - 97

EP - 101

JO - Artificial Intelligence in Medicine

JF - Artificial Intelligence in Medicine

SN - 0933-3657

IS - 2

ER -