Co-occurrence-based clustering of odor descriptors for predicting structure-odor relationship

Chuanjun Liu, Liang Shang, Kenshi Hayashi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

One problem in machine-learning-based prediction of structure-odor relationships is that odorant molecules are often labeled with ambiguous descriptors when the labels are collected from different sources. This study addresses that problem in two steps: clustering the odor descriptors with text-mining approaches, and predicting the newly established cluster labels from the physicochemical parameters of the classified odorant molecules. An odor database was built by web scraping and converted to a document-term matrix covering 4011 odorants and 100 odor descriptors. The odor descriptors were clustered using several combinations of co-occurrence matrices and clustering approaches. Hierarchical cluster analysis combined with a co-occurrence probability distribution matrix gave good results for descriptor clustering. Attribute labels were then established for each class and predicted from the physicochemical parameters of the classified odorants using a random forest model. An average accuracy higher than 82.42% was obtained, indicating the effectiveness of the proposed approach for predicting structure-odor relationships.
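The descriptor-clustering step can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the toy odorant labels are invented, the "co-occurrence probability distribution matrix" is read as row-normalized co-occurrence counts, and the L1 distance with average linkage is one plausible choice among several. The random forest prediction step is not covered.

```python
from itertools import combinations

# Hypothetical toy data: each odorant maps to the descriptors it was labeled with.
odorants = {
    "mol_a": {"fruity", "sweet"},
    "mol_b": {"fruity", "sweet", "apple"},
    "mol_c": {"apple", "fruity"},
    "mol_d": {"woody", "earthy"},
    "mol_e": {"earthy", "woody", "musty"},
    "mol_f": {"musty", "earthy"},
}
descriptors = sorted(set().union(*odorants.values()))


def cooccurrence_counts(labels, terms):
    """Count how often each pair of descriptors labels the same odorant.

    The diagonal holds each descriptor's own occurrence count.
    """
    index = {t: i for i, t in enumerate(terms)}
    counts = [[0] * len(terms) for _ in terms]
    for tags in labels.values():
        for a in tags:
            for b in tags:
                counts[index[a]][index[b]] += 1
    return counts


def row_distributions(counts):
    """Normalize each row so it sums to 1 (assumed reading of the paper's matrix)."""
    dists = []
    for row in counts:
        total = sum(row)
        dists.append([c / total if total else 0.0 for c in row])
    return dists


def l1(p, q):
    """L1 distance between two distributions (an assumed metric)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def average_linkage(dists, n_clusters):
    """Naive agglomerative clustering: merge the closest pair until n_clusters remain."""
    clusters = [[i] for i in range(len(dists))]

    def linkage(pair):
        a, b = pair
        return sum(l1(dists[i], dists[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=linkage)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return clusters


dists = row_distributions(cooccurrence_counts(odorants, descriptors))
clusters = average_linkage(dists, n_clusters=2)
groups = sorted(sorted(descriptors[i] for i in c) for c in clusters)
print(groups)
```

On this toy data the fruit-like and wood-like descriptors never co-occur, so their distributions have disjoint support and they end up in separate clusters. For a matrix of the size described in the abstract, a library implementation of hierarchical clustering would be the more practical choice.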

Original language: English
Title of host publication: ISOEN 2019 - 18th International Symposium on Olfaction and Electronic Nose, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781538683279
DOIs: https://doi.org/10.1109/ISOEN.2019.8823446
Publication status: Published - May 2019
Event: 18th International Symposium on Olfaction and Electronic Nose, ISOEN 2019 - Fukuoka, Japan
Duration: May 26 2019 to May 29 2019

Publication series

Name: ISOEN 2019 - 18th International Symposium on Olfaction and Electronic Nose, Proceedings

Conference

Conference: 18th International Symposium on Olfaction and Electronic Nose, ISOEN 2019
Country: Japan
City: Fukuoka
Period: 5/26/19 to 5/29/19

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Instrumentation

Cite this

Liu, C., Shang, L., & Hayashi, K. (2019). Co-occurrence-based clustering of odor descriptors for predicting structure-odor relationship. In ISOEN 2019 - 18th International Symposium on Olfaction and Electronic Nose, Proceedings [8823446] (ISOEN 2019 - 18th International Symposium on Olfaction and Electronic Nose, Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ISOEN.2019.8823446