The effect of corpus size on case frame acquisition for predicate-argument structure analysis

Ryohei Sasano, Daisuke Kawahara, Sadao Kurohashi

Research output: Contribution to journal (article)

2 citations (Scopus)

Abstract

This paper reports the effect of corpus size on case frame acquisition for predicate-argument structure analysis in Japanese. For this study, we collect a Japanese corpus consisting of up to 100 billion words, and construct case frames from corpora of six different sizes. Then, we apply these case frames to syntactic and case structure analysis, and zero anaphora resolution, in order to investigate the relationship between the corpus size for case frame acquisition and the performance of predicate-argument structure analysis. We obtained better analyses by using case frames constructed from larger corpora; the performance was not saturated even with a corpus size of 100 billion words.

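Original language: English
Pages (from-to): 1361-1368
Number of pages: 8
Journal: IEICE Transactions on Information and Systems
Volume: E93-D
Issue number: 6
DOI: 10.1587/transinf.E93.D.1361
Publication status: Published - June 2010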

Fingerprint

Syntactics

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
  • Artificial Intelligence

Cite this

The effect of corpus size on case frame acquisition for predicate-argument structure analysis. / Sasano, Ryohei; Kawahara, Daisuke; Kurohashi, Sadao.

In: IEICE Transactions on Information and Systems, Vol. E93-D, No. 6, 06.2010, p. 1361-1368.

Sasano, Ryohei ; Kawahara, Daisuke ; Kurohashi, Sadao. / The effect of corpus size on case frame acquisition for predicate-argument structure analysis. In: IEICE Transactions on Information and Systems. 2010 ; Vol. E93-D, No. 6. pp. 1361-1368.
@article{9371de12a91e4c51b89a4ba350cd8b55,
title = "The effect of corpus size on case frame acquisition for predicate-argument structure analysis",
abstract = "This paper reports the effect of corpus size on case frame acquisition for predicate-argument structure analysis in Japanese. For this study, we collect a Japanese corpus consisting of up to 100 billion words, and construct case frames from corpora of six different sizes. Then, we apply these case frames to syntactic and case structure analysis, and zero anaphora resolution, in order to investigate the relationship between the corpus size for case frame acquisition and the performance of predicate-argument structure analysis. We obtained better analyses by using case frames constructed from larger corpora; the performance was not saturated even with a corpus size of 100 billion words.",
author = "Ryohei Sasano and Daisuke Kawahara and Sadao Kurohashi",
year = "2010",
month = "6",
doi = "10.1587/transinf.E93.D.1361",
language = "English",
volume = "E93-D",
pages = "1361--1368",
journal = "IEICE Transactions on Information and Systems",
issn = "0916-8532",
publisher = "一般社団法人電子情報通信学会",
number = "6",

}

TY - JOUR

T1 - The effect of corpus size on case frame acquisition for predicate-argument structure analysis

AU - Sasano, Ryohei

AU - Kawahara, Daisuke

AU - Kurohashi, Sadao

PY - 2010/6

Y1 - 2010/6

N2 - This paper reports the effect of corpus size on case frame acquisition for predicate-argument structure analysis in Japanese. For this study, we collect a Japanese corpus consisting of up to 100 billion words, and construct case frames from corpora of six different sizes. Then, we apply these case frames to syntactic and case structure analysis, and zero anaphora resolution, in order to investigate the relationship between the corpus size for case frame acquisition and the performance of predicate-argument structure analysis. We obtained better analyses by using case frames constructed from larger corpora; the performance was not saturated even with a corpus size of 100 billion words.

AB - This paper reports the effect of corpus size on case frame acquisition for predicate-argument structure analysis in Japanese. For this study, we collect a Japanese corpus consisting of up to 100 billion words, and construct case frames from corpora of six different sizes. Then, we apply these case frames to syntactic and case structure analysis, and zero anaphora resolution, in order to investigate the relationship between the corpus size for case frame acquisition and the performance of predicate-argument structure analysis. We obtained better analyses by using case frames constructed from larger corpora; the performance was not saturated even with a corpus size of 100 billion words.

UR - http://www.scopus.com/inward/record.url?scp=77952984545&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77952984545&partnerID=8YFLogxK

U2 - 10.1587/transinf.E93.D.1361

DO - 10.1587/transinf.E93.D.1361

M3 - Article

AN - SCOPUS:77952984545

VL - E93-D

SP - 1361

EP - 1368

JO - IEICE Transactions on Information and Systems

JF - IEICE Transactions on Information and Systems

SN - 0916-8532

IS - 6

ER -