Standard measure and SVM measure for feature selection and their performance effect for text classification

Yusuke Adachi, Naoya Onimura, Takanori Yamashita, Sachio Hirokawa

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Citations (Scopus)

Abstract

This paper compares the prediction performance of document classification based on a variety of feature selection measures. Empirical experiments were conducted for the dataset re0 with 10 measures for feature selection and with SVM. It is confirmed that the feature selection based on the SVM-score proposed by Sakai and Hirokawa (2012) outperforms the standard measures with a small number of features. In fact, 100 words are enough to get performance similar to that obtained with all words. The reason for the good performance of this feature selection is that the SVM-score captures not only the characteristic words of positive samples but of negative samples as well.
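
As a rough illustration of the idea in the abstract, the sketch below ranks words by the absolute value of a per-word score and keeps the top n, so that strongly negative words are retained alongside strongly positive ones. It assumes that the SVM-score of a word is its weight in a trained linear SVM, which the abstract does not spell out. The function name `select_by_svm_score` and the example weights are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def select_by_svm_score(weights: Dict[str, float], n: int) -> List[str]:
    """Return the n words with the largest absolute weight.

    `weights` maps each word to a signed score. Here the score is ASSUMED
    to be the word's weight in a trained linear SVM: positive values point
    to the positive class, negative values to the negative class.
    """
    # Sort by |weight| in descending order; ties are broken alphabetically
    # so the result is deterministic.
    ranked = sorted(weights, key=lambda w: (-abs(weights[w]), w))
    return ranked[:n]


# Hypothetical weights for illustration only (not results from the paper).
example_weights = {
    "wheat": 1.8,
    "grain": 1.1,
    "said": 0.05,
    "bank": -1.4,
    "rate": -0.9,
    "the": 0.01,
}

selected = select_by_svm_score(example_weights, 4)
print(selected)
```

Ranking by absolute value is one simple way to keep both kinds of words; taking the top n positive and top n negative words separately would be an equally plausible reading of the abstract.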

Original language: English
Title of host publication: 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings
Editors: Maria Indrawan-Santiago, Gabriele Anderst-Kotsis, Matthias Steinbauer, Ismail Khalil
Publisher: Association for Computing Machinery
Pages: 262-266
Number of pages: 5
ISBN (Electronic): 9781450348072
DOIs: https://doi.org/10.1145/3011141.3011190
Publication status: Published - Nov 28 2016
Event: 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Singapore, Singapore
Duration: Nov 28 2016 to Nov 30 2016

Publication series

Name: ACM International Conference Proceeding Series

Other

Other: 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016
Country: Singapore
City: Singapore
Period: 11/28/16 to 11/30/16

Fingerprint

Feature extraction
Experiments

All Science Journal Classification (ASJC) codes

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

Cite this

Adachi, Y., Onimura, N., Yamashita, T., & Hirokawa, S. (2016). Standard measure and SVM measure for feature selection and their performance effect for text classification. In M. Indrawan-Santiago, G. Anderst-Kotsis, M. Steinbauer, & I. Khalil (Eds.), 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings (pp. 262-266). (ACM International Conference Proceeding Series). Association for Computing Machinery. https://doi.org/10.1145/3011141.3011190

Standard measure and SVM measure for feature selection and their performance effect for text classification. / Adachi, Yusuke; Onimura, Naoya; Yamashita, Takanori; Hirokawa, Sachio.

18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings. ed. / Maria Indrawan-Santiago; Gabriele Anderst-Kotsis; Matthias Steinbauer; Ismail Khalil. Association for Computing Machinery, 2016. p. 262-266 (ACM International Conference Proceeding Series).

Adachi, Y, Onimura, N, Yamashita, T & Hirokawa, S 2016, Standard measure and SVM measure for feature selection and their performance effect for text classification. in M Indrawan-Santiago, G Anderst-Kotsis, M Steinbauer & I Khalil (eds), 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings. ACM International Conference Proceeding Series, Association for Computing Machinery, pp. 262-266, 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016, Singapore, Singapore, 11/28/16. https://doi.org/10.1145/3011141.3011190
Adachi Y, Onimura N, Yamashita T, Hirokawa S. Standard measure and SVM measure for feature selection and their performance effect for text classification. In Indrawan-Santiago M, Anderst-Kotsis G, Steinbauer M, Khalil I, editors, 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings. Association for Computing Machinery. 2016. p. 262-266. (ACM International Conference Proceeding Series). https://doi.org/10.1145/3011141.3011190
Adachi, Yusuke ; Onimura, Naoya ; Yamashita, Takanori ; Hirokawa, Sachio. / Standard measure and SVM measure for feature selection and their performance effect for text classification. 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings. editor / Maria Indrawan-Santiago ; Gabriele Anderst-Kotsis ; Matthias Steinbauer ; Ismail Khalil. Association for Computing Machinery, 2016. pp. 262-266 (ACM International Conference Proceeding Series).
@inproceedings{56e5494554ed436abfe031c7a10942cb,
title = "Standard measure and SVM measure for feature selection and their performance effect for text classification",
abstract = "This paper compares the prediction performance of document classification based on a variety of feature selection measures. Empirical experiments were conducted for the dataset re0 with 10 measures for feature selection and with SVM. It is confirmed that the feature selection based on the SVM-score proposed by Sakai and Hirokawa (2012) outperforms the standard measures with a small number of features. In fact, 100 words are enough to get performance similar to that obtained with all words. The reason for the good performance of this feature selection is that the SVM-score captures not only the characteristic words of positive samples but of negative samples as well.",
author = "Yusuke Adachi and Naoya Onimura and Takanori Yamashita and Sachio Hirokawa",
year = "2016",
month = "11",
day = "28",
doi = "10.1145/3011141.3011190",
language = "English",
series = "ACM International Conference Proceeding Series",
publisher = "Association for Computing Machinery",
pages = "262--266",
editor = "Maria Indrawan-Santiago and Gabriele Anderst-Kotsis and Matthias Steinbauer and Ismail Khalil",
booktitle = "18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings",

}

TY  - GEN
T1  - Standard measure and SVM measure for feature selection and their performance effect for text classification
AU  - Adachi, Yusuke
AU  - Onimura, Naoya
AU  - Yamashita, Takanori
AU  - Hirokawa, Sachio
PY  - 2016/11/28
Y1  - 2016/11/28
N2  - This paper compares the prediction performance of document classification based on a variety of feature selection measures. Empirical experiments were conducted for the dataset re0 with 10 measures for feature selection and with SVM. It is confirmed that the feature selection based on the SVM-score proposed by Sakai and Hirokawa (2012) outperforms the standard measures with a small number of features. In fact, 100 words are enough to get performance similar to that obtained with all words. The reason for the good performance of this feature selection is that the SVM-score captures not only the characteristic words of positive samples but of negative samples as well.
AB  - This paper compares the prediction performance of document classification based on a variety of feature selection measures. Empirical experiments were conducted for the dataset re0 with 10 measures for feature selection and with SVM. It is confirmed that the feature selection based on the SVM-score proposed by Sakai and Hirokawa (2012) outperforms the standard measures with a small number of features. In fact, 100 words are enough to get performance similar to that obtained with all words. The reason for the good performance of this feature selection is that the SVM-score captures not only the characteristic words of positive samples but of negative samples as well.
UR  - http://www.scopus.com/inward/record.url?scp=85014936596&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85014936596&partnerID=8YFLogxK
U2  - 10.1145/3011141.3011190
DO  - 10.1145/3011141.3011190
M3  - Conference contribution
AN  - SCOPUS:85014936596
T3  - ACM International Conference Proceeding Series
SP  - 262
EP  - 266
BT  - 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings
A2  - Indrawan-Santiago, Maria
A2  - Anderst-Kotsis, Gabriele
A2  - Steinbauer, Matthias
A2  - Khalil, Ismail
PB  - Association for Computing Machinery
ER  -