Standard measure and SVM measure for feature selection and their performance effect for text classification

Yusuke Adachi, Naoya Onimura, Takanori Yamashita, Sachio Hirokawa

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

This paper compares the prediction performance of document classification under a variety of feature selection measures. Empirical experiments were conducted on the re0 dataset using 10 feature selection measures and SVM. The results confirm that feature selection based on the SVM-score proposed by Sakai and Hirokawa (2012) outperforms the standard measures when the number of features is small. In fact, 100 words are enough to achieve performance similar to that obtained with all words. This feature selection performs well because the SVM-score captures the characteristic words not only of positive samples but also of negative samples.
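The last point of the abstract can be illustrated with a minimal sketch. It assumes that the SVM-score of a word is its weight in a trained linear SVM; the abstract does not define the score, so this is an assumption and not necessarily the authors' exact procedure. The function names and the toy weights are hypothetical. Ranking by absolute weight keeps strongly negative words, while ranking by signed weight keeps only positive ones.

```python
from typing import Dict, List


def select_by_svm_score(weights: Dict[str, float], n: int) -> List[str]:
    """Return the n words with the largest absolute weight.

    Assumption: `weights` maps each word to its weight in a trained
    linear SVM. Large positive weights mark words typical of positive
    samples; large negative weights mark words typical of negative ones.
    """
    # Sort by |weight| descending; break ties alphabetically for determinism.
    ranked = sorted(weights, key=lambda w: (-abs(weights[w]), w))
    return ranked[:n]


def select_positive_only(weights: Dict[str, float], n: int) -> List[str]:
    """Baseline for comparison: the n words with the largest signed weight."""
    ranked = sorted(weights, key=lambda w: (-weights[w], w))
    return ranked[:n]


# Hypothetical weights for illustration only; not taken from the paper.
toy_weights = {
    "wheat": 1.8,
    "grain": 1.2,
    "tonnes": 0.4,
    "the": 0.0,
    "said": -0.1,
    "profit": -1.5,
    "shares": -1.1,
}

both_sides = select_by_svm_score(toy_weights, 4)
positive_only = select_positive_only(toy_weights, 4)
print(both_sides)
print(positive_only)
```

With the same budget of four features, the absolute-weight ranking spends two slots on words that signal the negative class, whereas the positive-only ranking fills them with near-zero words that carry little information.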

Original language: English
Title of host publication: 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings
Editors: Maria Indrawan-Santiago, Gabriele Anderst-Kotsis, Matthias Steinbauer, Ismail Khalil
Publisher: Association for Computing Machinery
Pages: 262-266
Number of pages: 5
ISBN (Electronic): 9781450348072
DOI: https://doi.org/10.1145/3011141.3011190
Publication status: Published - Nov 28, 2016
Event: 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Singapore, Singapore
Duration: Nov 28, 2016 to Nov 30, 2016

Fingerprint

Feature extraction
Experiments

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Adachi, Y., Onimura, N., Yamashita, T., & Hirokawa, S. (2016). Standard measure and SVM measure for feature selection and their performance effect for text classification. In M. Indrawan-Santiago, G. Anderst-Kotsis, M. Steinbauer, & I. Khalil (Eds.), 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings (pp. 262-266). Association for Computing Machinery. https://doi.org/10.1145/3011141.3011190

@inproceedings{56e5494554ed436abfe031c7a10942cb,
title = "Standard measure and SVM measure for feature selection and their performance effect for text classification",
abstract = "This paper compares the prediction performance of document classification under a variety of feature selection measures. Empirical experiments were conducted on the re0 dataset using 10 feature selection measures and SVM. The results confirm that feature selection based on the SVM-score proposed by Sakai and Hirokawa (2012) outperforms the standard measures when the number of features is small. In fact, 100 words are enough to achieve performance similar to that obtained with all words. This feature selection performs well because the SVM-score captures the characteristic words not only of positive samples but also of negative samples.",
author = "Yusuke Adachi and Naoya Onimura and Takanori Yamashita and Sachio Hirokawa",
year = "2016",
month = "11",
day = "28",
doi = "10.1145/3011141.3011190",
language = "English",
pages = "262--266",
editor = "Maria Indrawan-Santiago and Gabriele Anderst-Kotsis and Matthias Steinbauer and Ismail Khalil",
booktitle = "18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings",
publisher = "Association for Computing Machinery",

}

TY - GEN

T1 - Standard measure and SVM measure for feature selection and their performance effect for text classification

AU - Adachi, Yusuke

AU - Onimura, Naoya

AU - Yamashita, Takanori

AU - Hirokawa, Sachio

PY - 2016/11/28

Y1 - 2016/11/28

AB - This paper compares the prediction performance of document classification under a variety of feature selection measures. Empirical experiments were conducted on the re0 dataset using 10 feature selection measures and SVM. The results confirm that feature selection based on the SVM-score proposed by Sakai and Hirokawa (2012) outperforms the standard measures when the number of features is small. In fact, 100 words are enough to achieve performance similar to that obtained with all words. This feature selection performs well because the SVM-score captures the characteristic words not only of positive samples but also of negative samples.

UR - http://www.scopus.com/inward/record.url?scp=85014936596&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85014936596&partnerID=8YFLogxK

U2 - 10.1145/3011141.3011190

DO - 10.1145/3011141.3011190

M3 - Conference contribution

AN - SCOPUS:85014936596

SP - 262

EP - 266

BT - 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings

A2 - Indrawan-Santiago, Maria

A2 - Anderst-Kotsis, Gabriele

A2 - Steinbauer, Matthias

A2 - Khalil, Ismail

PB - Association for Computing Machinery

ER -