Using short dependency relations from auto-parsed data for Chinese dependency parsing

Wenliang Chen, Daisuke Kawahara, Kiyotaka Uchimoto, Yujie Zhang, Hitoshi Isahara

Research output: Contribution to journal, Article

4 Citations (Scopus)

Abstract

Dependency parsing has attracted a surge of interest lately for applications such as machine translation and question answering. Several supervised learning methods can currently be used to train high-performance dependency parsers if sufficient labeled data are available. However, the statistical dependency parsers in current use perform poorly on words separated by long distances. To address this problem, this article presents an effective dependency parsing approach that incorporates short dependency information from unlabeled data. The unlabeled data are automatically parsed with a deterministic dependency parser, which performs relatively well on short dependencies between words. We then train another parser that uses the short dependency relations extracted from the output of the first parser. The proposed approach achieves an unlabeled attachment score of 86.52%, an absolute improvement of 1.24% over the baseline system on the Chinese Treebank data set. The results indicate that the proposed approach improves parsing performance for words separated by longer distances.
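As a rough illustration of the approach summarized above (count short, distance-limited head and dependent word pairs in automatically parsed unlabeled text, then expose those counts to a second parser as extra features), the following minimal Python sketch may help. It is not the authors' implementation: the sentence representation, the distance threshold, the frequency buckets, and all function names are assumptions made for the sketch, and both the deterministic first-stage parser and the second trained parser are outside its scope.

from collections import Counter
from typing import Iterable, List, Tuple

# One auto-parsed sentence: a list of (word, head_index) pairs, where
# head_index is the 0-based position of the head word, or -1 for the root.
ParsedSentence = List[Tuple[str, int]]

def collect_short_dependencies(parsed: Iterable[ParsedSentence],
                               max_len: int = 2) -> Counter:
    # Count (head_word, dependent_word) pairs whose surface distance is short.
    counts: Counter = Counter()
    for sentence in parsed:
        for idx, (word, head) in enumerate(sentence):
            if head < 0:
                continue  # skip the artificial root attachment
            if abs(idx - head) <= max_len:
                head_word = sentence[head][0]
                counts[(head_word, word)] += 1
    return counts

def short_dependency_feature(head_word: str, dep_word: str,
                             counts: Counter,
                             thresholds=(1, 5, 20)) -> str:
    # Map the raw count onto a coarse frequency bucket; a second parser
    # could use this string as one extra feature for a candidate arc.
    c = counts[(head_word, dep_word)]
    bucket = sum(c >= t for t in thresholds)  # 0 .. len(thresholds)
    return "SHORT_DEP_BUCKET=%d" % bucket

if __name__ == "__main__":
    # Toy stand-in for large-scale output of a deterministic parser.
    auto_parsed = [
        [("我", 1), ("喜欢", -1), ("读", 1), ("书", 2)],
        [("他", 1), ("读", -1), ("书", 1)],
    ]
    counts = collect_short_dependencies(auto_parsed, max_len=2)
    print(short_dependency_feature("读", "书", counts))  # SHORT_DEP_BUCKET=1

In the paper, the second parser is trained with such short dependency relations as additional information; the bucket scheme above is only a stand-in for whatever feature templates were actually used.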

Original language: English
Article number: 10
Journal: ACM Transactions on Asian Language Information Processing
Volume: 8
Issue number: 3
DOI: 10.1145/1568292.1568293
Publication status: Published - 1 Aug 2009

Fingerprint

Supervised learning

All Science Journal Classification (ASJC) codes

  • Computer Science (all)

Cite this

Using short dependency relations from auto-parsed data for Chinese dependency parsing. / Chen, Wenliang; Kawahara, Daisuke; Uchimoto, Kiyotaka; Zhang, Yujie; Isahara, Hitoshi.

In: ACM Transactions on Asian Language Information Processing, Vol. 8, No. 3, 10, 01.08.2009.

Research output: Contribution to journal, Article

Chen, Wenliang ; Kawahara, Daisuke ; Uchimoto, Kiyotaka ; Zhang, Yujie ; Isahara, Hitoshi. / Using short dependency relations from auto-parsed data for Chinese dependency parsing. In: ACM Transactions on Asian Language Information Processing. 2009 ; Vol. 8, No. 3.
@article{f96d948af7ad434b902da74d137efe8e,
title = "Using short dependency relations from auto-parsed data for Chinese dependency parsing",
abstract = "Dependency parsing has become increasingly popular for a surge of interest lately for applications such as machine translation and question answering. Currently, several supervised learning methods can be used for training high-performance dependency parsers if sufficient labeled data are available. However, currently used statistical dependency parsers provide poor results for words separated by long distances. In order to solve this problem, this article presents an effective dependency parsing approach of incorporating short dependency information from unlabeled data. The unlabeled data is automatically parsed by using a deterministic dependency parser, which exhibits a relatively high performance for short dependencies between words. We then train another parser that uses the information on short dependency relations extracted from the output of the first parser. The proposed approach achieves an unlabeled attachment score of 86.52{\%}, an absolute 1.24{\%} improvement over the baseline system on the Chinese Treebank data set. The results indicate that the proposed approach improves the parsing performance for longer distance words.",
author = "Wenliang Chen and Daisuke Kawahara and Kiyotaka Uchimoto and Yujie Zhang and Hitoshi Isahara",
year = "2009",
month = "8",
day = "1",
doi = "10.1145/1568292.1568293",
language = "English",
volume = "8",
journal = "ACM Transactions on Asian Language Information Processing",
issn = "1530-0226",
publisher = "Association for Computing Machinery (ACM)",
number = "3",

}

TY - JOUR

T1 - Using short dependency relations from auto-parsed data for Chinese dependency parsing

AU - Chen, Wenliang

AU - Kawahara, Daisuke

AU - Uchimoto, Kiyotaka

AU - Zhang, Yujie

AU - Isahara, Hitoshi

PY - 2009/8/1

Y1 - 2009/8/1

N2 - Dependency parsing has become increasingly popular for a surge of interest lately for applications such as machine translation and question answering. Currently, several supervised learning methods can be used for training high-performance dependency parsers if sufficient labeled data are available. However, currently used statistical dependency parsers provide poor results for words separated by long distances. In order to solve this problem, this article presents an effective dependency parsing approach of incorporating short dependency information from unlabeled data. The unlabeled data is automatically parsed by using a deterministic dependency parser, which exhibits a relatively high performance for short dependencies between words. We then train another parser that uses the information on short dependency relations extracted from the output of the first parser. The proposed approach achieves an unlabeled attachment score of 86.52%, an absolute 1.24% improvement over the baseline system on the Chinese Treebank data set. The results indicate that the proposed approach improves the parsing performance for longer distance words.

AB - Dependency parsing has become increasingly popular for a surge of interest lately for applications such as machine translation and question answering. Currently, several supervised learning methods can be used for training high-performance dependency parsers if sufficient labeled data are available. However, currently used statistical dependency parsers provide poor results for words separated by long distances. In order to solve this problem, this article presents an effective dependency parsing approach of incorporating short dependency information from unlabeled data. The unlabeled data is automatically parsed by using a deterministic dependency parser, which exhibits a relatively high performance for short dependencies between words. We then train another parser that uses the information on short dependency relations extracted from the output of the first parser. The proposed approach achieves an unlabeled attachment score of 86.52%, an absolute 1.24% improvement over the baseline system on the Chinese Treebank data set. The results indicate that the proposed approach improves the parsing performance for longer distance words.

UR - http://www.scopus.com/inward/record.url?scp=70349084103&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70349084103&partnerID=8YFLogxK

U2 - 10.1145/1568292.1568293

DO - 10.1145/1568292.1568293

M3 - Article

AN - SCOPUS:70349084103

VL - 8

JO - ACM Transactions on Asian Language Information Processing

JF - ACM Transactions on Asian Language Information Processing

SN - 1530-0226

IS - 3

M1 - 10

ER -