Classification of speaking proficiency level by machine learning and feature selection

Brendan Flanagan, Sachio Hirokawa, Emiko Kaneko, Emi Izumi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Analysis of publicly available language learning corpora can be useful for extracting characteristic features of learners from different proficiency levels. This can then be used to support language learning research and the creation of educational resources. In this paper, we classify the words and parts of speech of transcripts from different speaking proficiency levels found in the NICT-JLE corpus. The characteristic features of learners who have the equivalent spoken proficiency of CEFR levels A1 through to B2 were extracted by analyzing the data with the support vector machine method. In particular, we apply feature selection to find a set of characteristic features that achieve optimal classification performance, which can be used to predict spoken learner proficiency.
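The approach outlined in the abstract (a support vector machine combined with feature selection) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the transcripts are invented toy sentences rather than NICT-JLE data, the task is reduced to two classes instead of CEFR levels A1 to B2, only word counts are used (no parts of speech), and ranking features by the absolute value of their SVM weight is one common selection criterion, not necessarily the one used in the paper. The helper names (`bag_of_words`, `train_linear_svm`, `select_top_features`) are hypothetical.

```python
from collections import Counter

# Invented toy transcripts, NOT taken from the NICT-JLE corpus.
# Label -1 = lower proficiency, +1 = higher proficiency (binary simplification).
transcripts = [
    ("i like dog", -1),
    ("i have dog and cat", -1),
    ("i go to school", -1),
    ("although it rained we went hiking", 1),
    ("i would have gone although i was tired", 1),
    ("we were considering moving abroad", 1),
]

def bag_of_words(text):
    """Map a transcript to word counts (a stand-in for word/POS features)."""
    return Counter(text.lower().split())

def score(w, b, x):
    """Linear decision value, using only the features present in w."""
    return sum(w[f] * v for f, v in x.items() if f in w) + b

def train_linear_svm(samples, labels, features, epochs=200, lr=0.1, lam=0.01):
    """Batch subgradient descent on the L2-regularised hinge loss."""
    w = {f: 0.0 for f in features}
    b = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = {f: lam * w[f] for f in w}  # gradient of the regulariser
        grad_b = 0.0
        for x, y in zip(samples, labels):
            if y * score(w, b, x) < 1:  # sample violates the margin
                for f, v in x.items():
                    if f in grad_w:
                        grad_w[f] -= y * v / n
                grad_b -= y / n
        for f in w:
            w[f] -= lr * grad_w[f]
        b -= lr * grad_b
    return w, b

def select_top_features(w, k):
    """Keep the k features with the largest absolute weight (ties by name)."""
    return sorted(w, key=lambda f: (-abs(w[f]), f))[:k]

samples = [bag_of_words(text) for text, _ in transcripts]
labels = [label for _, label in transcripts]
vocab = sorted({f for x in samples for f in x})

# Step 1: train on all features and inspect the learned weights.
weights, bias = train_linear_svm(samples, labels, vocab)

# Step 2: keep only the most strongly weighted features and retrain.
selected = select_top_features(weights, k=4)
small_w, small_b = train_linear_svm(samples, labels, selected)

predictions = [1 if score(small_w, small_b, x) >= 0 else -1 for x in samples]
print(selected)
print(predictions)
```

In a realistic setting, the number of retained features `k` would be varied and the classifier evaluated on held-out transcripts, so that the feature set giving the best classification performance can be identified.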

Original language: English
Title of host publication: Emerging Technologies for Education - 1st International Symposium, SETE 2016 Held in Conjunction with ICWL 2016, Revised Selected Papers
Editors: Rosella Gennari, Yiwei Cao, Yueh-Min Huang, Wu Wu, Haoran Xie
Publisher: Springer Verlag
Pages: 677-682
Number of pages: 6
ISBN (Print): 9783319528359
DOIs: https://doi.org/10.1007/978-3-319-52836-6_72
Publication status: Published - Jan 1 2017
Event: 1st International Symposium on Emerging Technologies for Education, SETE 2016 Held in Conjunction with ICWL 2016 - Rome, Italy
Duration: Oct 26 2016 to Oct 29 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10108 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 1st International Symposium on Emerging Technologies for Education, SETE 2016 Held in Conjunction with ICWL 2016
Country: Italy
City: Rome
Period: 10/26/16 to 10/29/16

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Flanagan, B., Hirokawa, S., Kaneko, E., & Izumi, E. (2017). Classification of speaking proficiency level by machine learning and feature selection. In R. Gennari, Y. Cao, Y-M. Huang, W. Wu, & H. Xie (Eds.), Emerging Technologies for Education - 1st International Symposium, SETE 2016 Held in Conjunction with ICWL 2016, Revised Selected Papers (pp. 677-682). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10108 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-319-52836-6_72