Classification of imbalanced documents by feature selection

Yusuke Adachi, Naoya Onimura, Takanori Yamashita, Sachio Hirokawa

Research output: Contribution to book/report › Conference contribution

1 Citation (Scopus)

Abstract

We previously worked on the category classification problem for Reuters newspaper articles using SVM and feature selection. In that study, feature selection by SVM-score [Sakai, Hirokawa, 2012] showed high accuracy. It was also expected to outperform other standard indicators when the data are imbalanced. This study aimed to show the effectiveness of feature selection by SVM-score in machine learning with imbalanced data. For the Reuters data, the F-measure was calculated in classification experiments covering all 13 categories. As a result, feature selection by SVM-score showed high F-measure and precision. In addition, we found that feature words of negative examples improve classification performance.
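As a rough illustration of why the F-measure, rather than plain accuracy, is the natural yardstick for imbalanced data, the following minimal sketch computes precision, recall, and F-measure in plain Python. The label vectors, the 5-to-95 class ratio, and the function names are hypothetical and are not taken from the paper; the sketch does not reproduce the authors' experiments or their SVM-score feature selection.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F-measure) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of documents whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical imbalanced test set: 5 positive and 95 negative documents.
y_true = [1] * 5 + [0] * 95

# Classifier A (hypothetical): always predicts the majority (negative) class.
pred_a = [0] * 100

# Classifier B (hypothetical): finds 4 of the 5 positives, with 1 false alarm.
pred_b = [1, 1, 1, 1, 0] + [1] + [0] * 94

for name, pred in (("A", pred_a), ("B", pred_b)):
    p, r, f = precision_recall_f1(y_true, pred)
    print(f"{name}: accuracy={accuracy(y_true, pred):.2f} "
          f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

In this toy setting classifier A reaches 95% accuracy while never finding a positive document, so its F-measure is zero. Classifier B's F-measure reflects that it actually retrieves the minority class.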

Original language: English
Title of host publication: Proceedings of 2017 International Conference on Compute and Data Analysis, ICCDA 2017
Publisher: Association for Computing Machinery
Pages: 228-232
Number of pages: 5
Volume: Part F130280
ISBN (Electronic): 9781450352413
DOI: https://doi.org/10.1145/3093241.3093246
Publication status: Published - May 19, 2017
Event: 2017 International Conference on Compute and Data Analysis, ICCDA 2017 - Lakeland, United States
Duration: May 19, 2017 to May 23, 2017

Fingerprint

Feature extraction
Learning systems
Experiments

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Adachi, Y., Onimura, N., Yamashita, T., & Hirokawa, S. (2017). Classification of imbalanced documents by feature selection. In Proceedings of 2017 International Conference on Compute and Data Analysis, ICCDA 2017 (Vol. Part F130280, pp. 228-232). Association for Computing Machinery. https://doi.org/10.1145/3093241.3093246

@inproceedings{827c6a7cbe714c73832642a669a3a622,
title = "Classification of imbalanced documents by feature selection",
author = "Yusuke Adachi and Naoya Onimura and Takanori Yamashita and Sachio Hirokawa",
year = "2017",
month = "5",
day = "19",
doi = "10.1145/3093241.3093246",
language = "English",
volume = "Part F130280",
pages = "228--232",
booktitle = "Proceedings of 2017 International Conference on Compute and Data Analysis, ICCDA 2017",
publisher = "Association for Computing Machinery",

}

TY - GEN
T1 - Classification of imbalanced documents by feature selection
AU - Adachi, Yusuke
AU - Onimura, Naoya
AU - Yamashita, Takanori
AU - Hirokawa, Sachio
PY - 2017/5/19
Y1 - 2017/5/19
UR - http://www.scopus.com/inward/record.url?scp=85030120453&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85030120453&partnerID=8YFLogxK
U2 - 10.1145/3093241.3093246
DO - 10.1145/3093241.3093246
M3 - Conference contribution
VL - Part F130280
SP - 228
EP - 232
BT - Proceedings of 2017 International Conference on Compute and Data Analysis, ICCDA 2017
PB - Association for Computing Machinery
ER -