Classification of speaking proficiency level by machine learning and feature selection

Brendan Flanagan, Sachio Hirokawa, Emiko Kaneko, Emi Izumi

    Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

    Abstract

    Analysis of publicly available language learning corpora can be useful for extracting characteristic features of learners at different proficiency levels. These features can then be used to support language learning research and the creation of educational resources. In this paper, we classify the words and parts of speech of transcripts from different speaking proficiency levels found in the NICT-JLE corpus. The characteristic features of learners whose spoken proficiency is equivalent to CEFR levels A1 through B2 were extracted by analyzing the data with the support vector machine method. In particular, we apply feature selection to find a set of characteristic features that achieves optimal classification performance and can be used to predict spoken learner proficiency.
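
As a rough illustration of the kind of pipeline the abstract describes, the sketch below trains a linear SVM on bag-of-words counts, ranks features by absolute weight, and retrains on the top-k subset. It is not the authors' implementation: the toy transcripts, the two-class labels, the weight-based ranking criterion, the hyperparameters, and every function name are assumptions made for illustration, and the score reported is training accuracy on invented data rather than a cross-validated result.

```python
from collections import Counter

# Hypothetical toy transcripts, invented for illustration (not NICT-JLE data).
# Label +1 stands for a higher proficiency level, -1 for a lower one.
TOY_DATA = [
    ("i think that although it was difficult we managed", 1),
    ("however i would say that it depends on the situation", 1),
    ("although we were tired we would continue", 1),
    ("i would say that it was however difficult", 1),
    ("i go shop yesterday", -1),
    ("he go school and i go home", -1),
    ("i like shop and i like home", -1),
    ("yesterday i go school", -1),
]


def vectorize(text, vocab):
    """Bag-of-words count vector for one transcript over a fixed vocabulary."""
    counts = Counter(text.split())
    return [float(counts[word]) for word in vocab]


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_linear_svm(X, y, epochs=200, lam=0.01, lr=0.1):
    """Fit a linear SVM by subgradient descent on the regularised hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:
                # Margin violated: step along the hinge-loss subgradient.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Margin satisfied: only the regulariser shrinks the weights.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def predict(w, b, x):
    return 1 if dot(w, x) + b >= 0 else -1


def accuracy(w, b, X, y):
    return sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)


def select_top_features(vocab, w, k):
    """Keep the k vocabulary items with the largest absolute SVM weight."""
    ranked = sorted(range(len(vocab)), key=lambda j: abs(w[j]), reverse=True)
    return [vocab[j] for j in ranked[:k]]


def evaluate_feature_counts(data, ks):
    """Train on all features, then retrain on each top-k subset and score it."""
    texts, labels = zip(*data)
    vocab = sorted({word for text in texts for word in text.split()})
    X = [vectorize(t, vocab) for t in texts]
    w, _ = train_linear_svm(X, labels)
    results = {}
    for k in ks:
        selected = select_top_features(vocab, w, k)
        Xk = [vectorize(t, selected) for t in texts]
        wk, bk = train_linear_svm(Xk, labels)
        results[k] = (selected, accuracy(wk, bk, Xk, labels))
    return results


if __name__ == "__main__":
    for k, (features, acc) in evaluate_feature_counts(TOY_DATA, [2, 4, 8]).items():
        print(k, features, f"training accuracy = {acc:.2f}")
```

Comparing the score across several values of k is one simple way to look for a compact feature set that still classifies well. A realistic study would add part-of-speech features, handle more than two proficiency levels, and score each subset on held-out data.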

    Original language: English
    Title of host publication: Emerging Technologies for Education - 1st International Symposium, SETE 2016 Held in Conjunction with ICWL 2016, Revised Selected Papers
    Editors: Rosella Gennari, Yiwei Cao, Yueh-Min Huang, Wu Wu, Haoran Xie
    Publisher: Springer Verlag
    Pages: 677-682
    Number of pages: 6
    ISBN (Print): 9783319528359
    Publication status: Published - 2017
    Event: 1st International Symposium on Emerging Technologies for Education, SETE 2016 Held in Conjunction with ICWL 2016 - Rome, Italy
    Duration: Oct 26 2016 to Oct 29 2016

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 10108 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Other

    Other: 1st International Symposium on Emerging Technologies for Education, SETE 2016 Held in Conjunction with ICWL 2016
    Country/Territory: Italy
    City: Rome
    Period: 10/26/16 to 10/29/16

    All Science Journal Classification (ASJC) codes

    • Theoretical Computer Science
    • Computer Science (all)
