Classification of imbalanced documents by feature selection

Yusuke Adachi, Naoya Onimura, Takanori Yamashita, Sachio Hirokawa

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

We previously worked on the category classification problem for Reuters newspaper articles using SVM and feature selection. In that study, feature selection by SVM-score [Sakai, Hirokawa, 2012] showed high accuracy. It was also expected to outperform other standard indicators when the data is imbalanced. This study aimed to show the effectiveness of feature selection by SVM-score in machine learning with imbalanced data. For the Reuters data, the F-measure was calculated in classification experiments over all 13 categories. As a result, feature selection by SVM-score showed high F-measure and precision. In addition, we found that feature words of negative examples improve classification performance.
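The abstract reports F-measure and precision rather than accuracy alone. A minimal sketch of why that matters on imbalanced data follows. The toy labels and the two hypothetical classifiers are assumptions made for illustration only; they are not the paper's data, method, or results.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of documents whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical imbalanced toy set: 5 positive documents out of 100.
y_true = [1] * 5 + [0] * 95

# Hypothetical classifier A: always predicts the majority (negative) class.
always_negative = [0] * 100

# Hypothetical classifier B: finds 4 of the 5 positives, with 1 false positive.
selective = [1, 1, 1, 1, 0] + [1] + [0] * 94

for name, y_pred in [("always_negative", always_negative), ("selective", selective)]:
    p, r, f1 = precision_recall_f1(y_true, y_pred)
    print(f"{name}: accuracy={accuracy(y_true, y_pred):.2f} "
          f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

In this toy setting the majority-class predictor reaches 0.95 accuracy while its F-measure is 0, which is why F-measure and precision are the more informative figures when one category is rare.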

Original language: English
Title of host publication: Proceedings of 2017 International Conference on Compute and Data Analysis, ICCDA 2017
Publisher: Association for Computing Machinery
Pages: 228-232
Number of pages: 5
Volume: Part F130280
ISBN (Electronic): 9781450352413
Publication status: Published - May 19 2017
Event: 2017 International Conference on Compute and Data Analysis, ICCDA 2017 - Lakeland, United States
Duration: May 19 2017 → May 23 2017

Other

Other: 2017 International Conference on Compute and Data Analysis, ICCDA 2017
Country: United States
City: Lakeland
Period: 5/19/17 → 5/23/17

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software
