Building a diverse document leads corpus annotated with semantic relations

Masatsugu Hangyo, Daisuke Kawahara, Sadao Kurohashi

Research output: Contribution to book/report › Conference contribution

11 Citations (Scopus)

Abstract

In these days, semantic analysis has been actively studied in natural language processing. For the study of semantic analysis, corpora with semantic annotations are essential. Although there are such corpora annotated on newspaper articles, there are various genres and styles, including linguistic expressions that are not found in newspaper articles. In this paper, we build a diverse document leads corpus annotated with semantic relations. To reduce the workload of annotators and annotate as many various documents as possible, we restrict the annotation target of each document to only the first three sentences. We have completed building a corpus of 1,000 documents and report the statistics of this corpus.

Original language: English
Title of host publication: Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012
Pages: 535-544
Number of pages: 10
Publication status: Published - Dec 1 2012
Event: 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012 - Bali, Indonesia
Duration: Nov 7 2012 → Nov 7 2012

Publication series

Name: Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012

Other

Other: 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012
Country: Indonesia
City: Bali
Period: 11/7/12 → 11/7/12

Fingerprint

Semantics
Linguistics
Statistics
Processing

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Software

Cite this

Hangyo, M., Kawahara, D., & Kurohashi, S. (2012). Building a diverse document leads corpus annotated with semantic relations. In Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012 (pp. 535-544). (Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012).

Building a diverse document leads corpus annotated with semantic relations. / Hangyo, Masatsugu; Kawahara, Daisuke; Kurohashi, Sadao.

Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012. 2012. p. 535-544 (Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012).

Hangyo, M, Kawahara, D & Kurohashi, S 2012, Building a diverse document leads corpus annotated with semantic relations. in Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012. Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012, pp. 535-544, 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012, Bali, Indonesia, 11/7/12.
Hangyo M, Kawahara D, Kurohashi S. Building a diverse document leads corpus annotated with semantic relations. In Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012. 2012. p. 535-544. (Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012).
Hangyo, Masatsugu ; Kawahara, Daisuke ; Kurohashi, Sadao. / Building a diverse document leads corpus annotated with semantic relations. Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012. 2012. pp. 535-544 (Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012).
@inproceedings{baeafe4624e4411bb0a37ebe07780553,
title = "Building a diverse document leads corpus annotated with semantic relations",
abstract = "In these days, semantic analysis has been actively studied in natural language processing. For the study of semantic analysis, corpora with semantic annotations are essential. Although there are such corpora annotated on newspaper articles, there are various genres and styles, including linguistic expressions that are not found in newspaper articles. In this paper, we build a diverse document leads corpus annotated with semantic relations. To reduce the workload of annotators and annotate as many various documents as possible, we restrict the annotation target of each document to only the first three sentences. We have completed building a corpus of 1,000 documents and report the statistics of this corpus.",
author = "Masatsugu Hangyo and Daisuke Kawahara and Sadao Kurohashi",
year = "2012",
month = "12",
day = "1",
language = "English",
isbn = "9789791421171",
series = "Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012",
pages = "535--544",
booktitle = "Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012",

}

TY - GEN

T1 - Building a diverse document leads corpus annotated with semantic relations

AU - Hangyo, Masatsugu

AU - Kawahara, Daisuke

AU - Kurohashi, Sadao

PY - 2012/12/1

Y1 - 2012/12/1

N2 - In these days, semantic analysis has been actively studied in natural language processing. For the study of semantic analysis, corpora with semantic annotations are essential. Although there are such corpora annotated on newspaper articles, there are various genres and styles, including linguistic expressions that are not found in newspaper articles. In this paper, we build a diverse document leads corpus annotated with semantic relations. To reduce the workload of annotators and annotate as many various documents as possible, we restrict the annotation target of each document to only the first three sentences. We have completed building a corpus of 1,000 documents and report the statistics of this corpus.

AB - In these days, semantic analysis has been actively studied in natural language processing. For the study of semantic analysis, corpora with semantic annotations are essential. Although there are such corpora annotated on newspaper articles, there are various genres and styles, including linguistic expressions that are not found in newspaper articles. In this paper, we build a diverse document leads corpus annotated with semantic relations. To reduce the workload of annotators and annotate as many various documents as possible, we restrict the annotation target of each document to only the first three sentences. We have completed building a corpus of 1,000 documents and report the statistics of this corpus.

UR - http://www.scopus.com/inward/record.url?scp=84883341328&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84883341328&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84883341328

SN - 9789791421171

T3 - Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012

SP - 535

EP - 544

BT - Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012

ER -