Although a large-scale English corpus can already be constructed easily and inexpensively from web documents, the quality of the English in such documents is not reliable. We have therefore developed a system that automatically gathers English technical papers from the web, classifies them into those written by native speakers and those written by non-native speakers, and uses them to construct a native-speaker corpus and a non-native-speaker corpus. We discuss a method that uses the corpus to suggest more appropriate adjectives as alternative candidates for unnatural English adjective-noun co-occurrence constructions 〈a,n〉. The appropriateness of a candidate adjective a' is evaluated based on the similarity between the environments in which a and a' occur.
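The abstract does not say how the similarity of occurring environments is computed, so the following is only a minimal sketch under stated assumptions. It treats the "environment" of an adjective as the set of nouns it modifies in the native-speaker corpus, and it uses cosine similarity over those noun counts. Both choices are assumptions, as are the function names `build_contexts`, `cosine`, and `suggest_adjectives` and the toy adjective-noun pairs, none of which come from the paper.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_contexts(pairs):
    """Map each adjective to a Counter of the nouns it modifies."""
    contexts = defaultdict(Counter)
    for adj, noun in pairs:
        contexts[adj][noun] += 1
    return contexts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def suggest_adjectives(a, n, native_pairs, top_k=3):
    """Rank alternative adjectives a' for the pair <a, n>.

    Assumption: candidates are adjectives that modify n in the
    native-speaker data, scored by how similar their noun contexts
    are to those of the original adjective a.
    """
    contexts = build_contexts(native_pairs)
    target = contexts.get(a, Counter())
    candidates = [
        adj for adj, nouns in contexts.items() if adj != a and nouns[n] > 0
    ]
    scored = [(cand, cosine(target, contexts[cand])) for cand in candidates]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


# Toy adjective-noun pairs standing in for a native-speaker corpus (invented).
native_pairs = [
    ("strong", "accent"), ("strong", "wind"),
    ("heavy", "rain"), ("heavy", "accent"), ("heavy", "traffic"),
    ("light", "rain"), ("light", "breeze"),
]

# Alternatives for the unnatural pair <strong, rain>.
for adjective, score in suggest_adjectives("strong", "rain", native_pairs):
    print(f"{adjective}\t{score:.3f}")
```

In this toy data, "heavy" shares the noun "accent" with "strong" and therefore ranks above "light", which shares none. A real system would extract the pairs from parsed corpus text and might use richer contexts than the modified noun alone.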
|Journal||Procedia - Social and Behavioral Sciences|
|Publication status||Published - 2011|
|Event||Conference on Pacific Association for Computational Linguistics, PACLING 2011 - Kuala Lumpur, Malaysia|
Duration: Jul 19 2011 → Jul 21 2011
All Science Journal Classification (ASJC) codes
- Social Sciences(all)