Feature words that classify problem sentence in scientific article

Toshihiko Sakai, Sachio Hirokawa

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

27 Citations (Scopus)

Abstract

Literature review requires understanding the contents of articles from several viewpoints, such as the problem and the method that the articles describe. Searching from these viewpoints would make surveys more efficient if particular segments of articles were extracted, indexed, and usable as auxiliary queries. This paper focuses on sentences that describe the problem in an abstract and on the feature sets that classify such problem sentences. Classification performance is evaluated by 10-fold cross-validation for six candidate sets of feature words. The set of all words achieves the best performance when 90% of the data is used as training data. However, a small set of words with positive scores outperforms the other feature sets when only 10% of the data is used for training. In such a realistic situation, the feature words are effective in improving classification performance.
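The evaluation described in the abstract can be illustrated with a small sketch. The abstract does not name the classifier or the word-scoring formula, so everything below is an assumption made for illustration: the smoothed log-odds score, the sum-of-scores decision rule, the function names, and the toy sentences are all hypothetical and are not the authors' data or method. The sketch compares an "all words" feature set with a "positive-score words only" feature set, using either nine folds (90%) or a single fold (10%) for training.

```python
import math
from collections import Counter

# Toy sentences invented for this sketch; they are NOT from the paper's corpus.
PROBLEM = [
    "however existing methods fail on large data",
    "the problem is that search is slow",
    "it is difficult to find relevant articles",
    "however this approach lacks accuracy",
    "the main issue is the cost of annotation",
    "current systems fail to handle noisy text",
    "it remains difficult to compare results",
    "the problem of sparse data is unsolved",
    "however little work addresses this issue",
    "existing tools lack support for abstracts",
]
OTHER = [
    "we propose a new indexing method",
    "this paper presents a classifier",
    "we evaluate the method on abstracts",
    "results show improved accuracy",
    "the system uses a support vector machine",
    "we describe the design of the tool",
    "experiments show the approach is fast",
    "this paper introduces a search interface",
    "we present results on two corpora",
    "the method is applied to articles",
]
SENTENCES = PROBLEM + OTHER
LABELS = [True] * len(PROBLEM) + [False] * len(OTHER)


def tokenize(sentence):
    return sentence.lower().split()


def word_scores(sentences, labels):
    """Assumed scoring: smoothed log-odds of a word occurring in
    problem sentences versus all other sentences."""
    pos, neg = Counter(), Counter()
    for sentence, is_problem in zip(sentences, labels):
        (pos if is_problem else neg).update(set(tokenize(sentence)))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return {
        w: math.log(((pos[w] + 1) / (n_pos + 2)) / ((neg[w] + 1) / (n_neg + 2)))
        for w in set(pos) | set(neg)
    }


def predict(sentence, scores, features):
    """Label a sentence as a problem sentence if its selected words sum above 0."""
    return sum(scores[w] for w in set(tokenize(sentence)) if w in features) > 0


def k_fold_accuracy(sentences, labels, k, positive_only, train_on_one_fold):
    """k-fold evaluation. With train_on_one_fold=True the roles are swapped:
    one fold (about 1/k of the data) trains, the remaining folds test."""
    folds = [list(range(i, len(sentences), k)) for i in range(k)]
    correct = total = 0
    for held in folds:
        rest = [i for i in range(len(sentences)) if i not in held]
        train, test = (held, rest) if train_on_one_fold else (rest, held)
        scores = word_scores([sentences[i] for i in train],
                             [labels[i] for i in train])
        if positive_only:
            features = {w for w, s in scores.items() if s > 0}
        else:
            features = set(scores)
        for i in test:
            correct += predict(sentences[i], scores, features) == labels[i]
            total += 1
    return correct / total


if __name__ == "__main__":
    for one_fold, share in ((False, "90%"), (True, "10%")):
        for positive_only in (False, True):
            name = "positive-score words" if positive_only else "all words"
            acc = k_fold_accuracy(SENTENCES, LABELS, 10, positive_only, one_fold)
            print(f"train on {share}, {name}: accuracy {acc:.2f}")
```

Accuracies on this toy data are not meant to reproduce the paper's results; the sketch only shows how the two training regimes and the two feature sets can be compared under the same cross-validation loop.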

Original language: English
Title of host publication: 14th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2012 - Proceedings
Pages: 360-367
Number of pages: 8
DOIs: https://doi.org/10.1145/2428736.2428803
Publication status: Published - 2012
Event: 14th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2012 - Bali, Indonesia
Duration: Dec 3 2012 - Dec 5 2012

Other

Other: 14th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2012
Country: Indonesia
City: Bali
Period: 12/3/12 - 12/5/12

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Sakai, T., & Hirokawa, S. (2012). Feature words that classify problem sentence in scientific article. In 14th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2012 - Proceedings (pp. 360-367) https://doi.org/10.1145/2428736.2428803

@inproceedings{9d4049c7bf3847efb60c5bd26521b401,
title = "Feature words that classify problem sentence in scientific article",
abstract = "Literature review requires understanding the contents of articles from several viewpoints, such as the problem and the method that the articles describe. Searching from these viewpoints would make surveys more efficient if particular segments of articles were extracted, indexed, and usable as auxiliary queries. This paper focuses on sentences that describe the problem in an abstract and on the feature sets that classify such problem sentences. Classification performance is evaluated by 10-fold cross-validation for six candidate sets of feature words. The set of all words achieves the best performance when 90{\%} of the data is used as training data. However, a small set of words with positive scores outperforms the other feature sets when only 10{\%} of the data is used for training. In such a realistic situation, the feature words are effective in improving classification performance.",
author = "Toshihiko Sakai and Sachio Hirokawa",
year = "2012",
doi = "10.1145/2428736.2428803",
language = "English",
isbn = "9781450313063",
pages = "360--367",
booktitle = "14th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2012 - Proceedings",

}

TY  - GEN
T1  - Feature words that classify problem sentence in scientific article
AU  - Sakai, Toshihiko
AU  - Hirokawa, Sachio
PY  - 2012
Y1  - 2012
N2  - Literature review requires understanding the contents of articles from several viewpoints, such as the problem and the method that the articles describe. Searching from these viewpoints would make surveys more efficient if particular segments of articles were extracted, indexed, and usable as auxiliary queries. This paper focuses on sentences that describe the problem in an abstract and on the feature sets that classify such problem sentences. Classification performance is evaluated by 10-fold cross-validation for six candidate sets of feature words. The set of all words achieves the best performance when 90% of the data is used as training data. However, a small set of words with positive scores outperforms the other feature sets when only 10% of the data is used for training. In such a realistic situation, the feature words are effective in improving classification performance.
UR  - http://www.scopus.com/inward/record.url?scp=84873381932&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84873381932&partnerID=8YFLogxK
U2  - 10.1145/2428736.2428803
DO  - 10.1145/2428736.2428803
M3  - Conference contribution
AN  - SCOPUS:84873381932
SN  - 9781450313063
SP  - 360
EP  - 367
BT  - 14th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2012 - Proceedings
ER  -