Classification and clustering English writing errors based on native language

Brendan Flanagan, Chengjiu Yin, Takahiko Suzuki, Sachio Hirokawa

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    3 Citations (Scopus)

    Abstract

    It is important for language learners to determine and reflect on their writing errors in order to overcome weaknesses. Each language learner has their own unique writing error characteristics and therefore has different learning needs. In this paper, we analyze the writing errors of foreign language learners on the language learning SNS website Lang-8 to investigate the characteristics of errors by native language. 142,465 sentences were collected from Lang-8 for analysis. For each native language, the predicted scores of 15 error categories from SVM machine learning models are used as a vector representation of each sentence. These score vectors are then clustered to determine error co-occurrence within the same sentence. The results were then analyzed to determine the error characteristics of different native languages.
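
The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or list the 15 error categories, so everything specific here is an assumption: plain k-means stands in for the clustering method, the three category names are hypothetical, and the score vectors are made-up values standing in for SVM outputs (positive meaning the category's classifier predicts the error is present).

```python
from math import dist

# Hypothetical subset of error categories (the paper uses 15; the abstract does not list them).
CATEGORIES = ["article", "preposition", "tense"]

# Made-up per-sentence score vectors standing in for SVM prediction scores.
SCORES = [
    [1.2, 0.9, -1.0],
    [-0.9, -1.2, 1.4],
    [0.8, 1.1, -0.7],
    [-1.1, -0.8, 0.9],
    [1.0, 0.7, -1.3],
]


def kmeans(points, k, iterations=20):
    """Plain k-means (an assumed stand-in); the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each sentence vector to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Recompute each centroid as the mean of its member vectors.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def co_occurring(centroid, names):
    """Categories whose mean score within the cluster is positive."""
    return [name for name, value in zip(names, centroid) if value > 0]


labels, centroids = kmeans(SCORES, k=2)
clusters = [co_occurring(c, CATEGORIES) for c in centroids]
print(labels)
print(clusters)
```

With this toy data, the first cluster groups sentences where the hypothetical article and preposition errors score positive together, and the second groups sentences with the hypothetical tense error. Repeating such a grouping per native language is one way to read the co-occurrence analysis the abstract describes.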

    Original language: English
    Title of host publication: Proceedings - 2014 IIAI 3rd International Conference on Advanced Applied Informatics, IIAI-AAI 2014
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 318-323
    Number of pages: 6
    ISBN (Electronic): 9781479941735
    Publication status: Published - Sep 29 2014
    Event: 3rd IIAI International Conference on Advanced Applied Informatics, IIAI-AAI 2014 - Kitakyushu, Japan
    Duration: Aug 31 2014 to Sep 4 2014

    Publication series

    Name: Proceedings - 2014 IIAI 3rd International Conference on Advanced Applied Informatics, IIAI-AAI 2014

    Other

    Other: 3rd IIAI International Conference on Advanced Applied Informatics, IIAI-AAI 2014
    Country/Territory: Japan
    City: Kitakyushu
    Period: 8/31/14 to 9/4/14

    All Science Journal Classification (ASJC) codes

    • Information Systems
