Toward Three-Stage Automation of Annotation for Human Values

Emi Ishita, Satoshi Fukuda, Toru Oga, Douglas W. Oard, Kenneth R. Fleischmann, Yoichi Tomiura, An Shou Cheng

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

Abstract

Prior work on automated annotation of human values has sought to train text classification techniques to label text spans with labels that reflect specific human values such as freedom, justice, or safety. This confounds three tasks: (1) selecting the documents to be labeled, (2) selecting the text spans that express or reflect human values, and (3) assigning labels to those spans. This paper proposes a three-stage model in which separate systems can be optimally trained for each of the three stages. Experiments from the first stage, document selection, indicate that annotation diversity trumps annotation quality, suggesting that when multiple annotators are available, the traditional practice of adjudicating conflicting annotations of the same documents is not as cost effective as an alternative in which each annotator labels different documents. Preliminary results for the second stage, selecting value sentences, indicate that high recall (94%) can be achieved on that task with levels of precision (above 80%) that seem suitable for use as part of a multi-stage annotation pipeline. The annotations created for these experiments are being made freely available, and the content that was annotated is available from commercial sources at modest cost.
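
The three-stage decomposition described in the abstract can be sketched in code. The following is only an illustrative sketch, not the authors' systems: every name in it (`annotate`, `split_sentences`, `precision_recall`, and the `toy_*` rules) is hypothetical, and simple keyword rules stand in for the trained classifiers that the paper proposes for each stage. The sentence splitter is deliberately naive.

```python
from typing import Callable, Iterable, List, Set, Tuple

# Hypothetical stage signatures; none of these names come from the paper.
DocSelector = Callable[[str], bool]        # stage 1: keep this document?
SentenceSelector = Callable[[str], bool]   # stage 2: is this a value sentence?
ValueLabeler = Callable[[str], List[str]]  # stage 3: which values does it express?


def split_sentences(doc: str) -> List[str]:
    """Naive period-based splitter, for illustration only."""
    return [s.strip() for s in doc.split(".") if s.strip()]


def annotate(
    docs: Iterable[str],
    select_doc: DocSelector,
    select_sentence: SentenceSelector,
    label_sentence: ValueLabeler,
) -> List[Tuple[str, List[str]]]:
    """Run the three stages in sequence; each stage only sees what the previous one kept."""
    results = []
    for doc in docs:
        if not select_doc(doc):                # stage 1: document selection
            continue
        for sent in split_sentences(doc):
            if select_sentence(sent):          # stage 2: value-sentence selection
                results.append((sent, label_sentence(sent)))  # stage 3: labeling
    return results


def precision_recall(predicted: Set, gold: Set) -> Tuple[float, float]:
    """Precision and recall of a predicted set against a gold-standard set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy keyword rules standing in for trained classifiers (assumptions, not the paper's models).
VALUE_WORDS = {"freedom", "justice", "safety"}


def toy_select_doc(doc: str) -> bool:
    return "policy" in doc.lower()


def toy_select_sentence(sent: str) -> bool:
    return any(word in sent.lower() for word in VALUE_WORDS)


def toy_label(sent: str) -> List[str]:
    return sorted(word for word in VALUE_WORDS if word in sent.lower())
```

Because each stage is a separate callable, any one of them can be retrained or evaluated on its own, for example by scoring the sentences kept by stage 2 against gold value sentences with `precision_recall`.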

Original language: English
Title of host publication: Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings
Editors: Natalie Greene Taylor, Caitlin Christian-Lamb, Bonnie Nardi, Michelle H. Martin
Publisher: Springer Verlag
Pages: 188-199
Number of pages: 12
ISBN (Print): 9783030157418
DOI: https://doi.org/10.1007/978-3-030-15742-5_18
Publication status: Published - Jan 1 2019
Event: 14th International Conference on Information in Contemporary Society, iConference 2019 - Washington, United States
Duration: Mar 31 2019 to Apr 3 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11420 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 14th International Conference on Information in Contemporary Society, iConference 2019
Country: United States
City: Washington
Period: 3/31/19 to 4/3/19

Fingerprint

Automation
Annotation
Labels
Costs
Text Classification
Pipelines
Experiments
Experiment
Human
Express
Safety
Alternatives
Text

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)

Cite this

Ishita, E., Fukuda, S., Oga, T., Oard, D. W., Fleischmann, K. R., Tomiura, Y., & Cheng, A. S. (2019). Toward Three-Stage Automation of Annotation for Human Values. In N. G. Taylor, C. Christian-Lamb, B. Nardi, & M. H. Martin (Eds.), Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings (pp. 188-199). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11420 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-15742-5_18

Toward Three-Stage Automation of Annotation for Human Values. / Ishita, Emi; Fukuda, Satoshi; Oga, Toru; Oard, Douglas W.; Fleischmann, Kenneth R.; Tomiura, Yoichi; Cheng, An Shou.

Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings. ed. / Natalie Greene Taylor; Caitlin Christian-Lamb; Bonnie Nardi; Michelle H. Martin. Springer Verlag, 2019. p. 188-199 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11420 LNCS).

Ishita, E, Fukuda, S, Oga, T, Oard, DW, Fleischmann, KR, Tomiura, Y & Cheng, AS 2019, Toward Three-Stage Automation of Annotation for Human Values. in NG Taylor, C Christian-Lamb, B Nardi & MH Martin (eds), Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11420 LNCS, Springer Verlag, pp. 188-199, 14th International Conference on Information in Contemporary Society, iConference 2019, Washington, United States, 3/31/19. https://doi.org/10.1007/978-3-030-15742-5_18
Ishita E, Fukuda S, Oga T, Oard DW, Fleischmann KR, Tomiura Y et al. Toward Three-Stage Automation of Annotation for Human Values. In Taylor NG, Christian-Lamb C, Nardi B, Martin MH, editors, Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings. Springer Verlag. 2019. p. 188-199. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-030-15742-5_18
Ishita, Emi ; Fukuda, Satoshi ; Oga, Toru ; Oard, Douglas W. ; Fleischmann, Kenneth R. ; Tomiura, Yoichi ; Cheng, An Shou. / Toward Three-Stage Automation of Annotation for Human Values. Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings. editor / Natalie Greene Taylor ; Caitlin Christian-Lamb ; Bonnie Nardi ; Michelle H. Martin. Springer Verlag, 2019. pp. 188-199 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{e9c0e4d1371941068ba637ba0e30bbc1,
title = "Toward Three-Stage Automation of Annotation for Human Values",
author = "Emi Ishita and Satoshi Fukuda and Toru Oga and Oard, {Douglas W.} and Fleischmann, {Kenneth R.} and Yoichi Tomiura and Cheng, {An Shou}",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/978-3-030-15742-5_18",
language = "English",
isbn = "9783030157418",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "188--199",
editor = "Taylor, {Natalie Greene} and Caitlin Christian-Lamb and Bonnie Nardi and Martin, {Michelle H.}",
booktitle = "Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings",
address = "Germany",

}

TY - GEN

T1 - Toward Three-Stage Automation of Annotation for Human Values

AU - Ishita, Emi

AU - Fukuda, Satoshi

AU - Oga, Toru

AU - Oard, Douglas W.

AU - Fleischmann, Kenneth R.

AU - Tomiura, Yoichi

AU - Cheng, An Shou

PY - 2019/1/1

Y1 - 2019/1/1

AB - Prior work on automated annotation of human values has sought to train text classification techniques to label text spans with labels that reflect specific human values such as freedom, justice, or safety. This confounds three tasks: (1) selecting the documents to be labeled, (2) selecting the text spans that express or reflect human values, and (3) assigning labels to those spans. This paper proposes a three-stage model in which separate systems can be optimally trained for each of the three stages. Experiments from the first stage, document selection, indicate that annotation diversity trumps annotation quality, suggesting that when multiple annotators are available, the traditional practice of adjudicating conflicting annotations of the same documents is not as cost effective as an alternative in which each annotator labels different documents. Preliminary results for the second stage, selecting value sentences, indicate that high recall (94%) can be achieved on that task with levels of precision (above 80%) that seem suitable for use as part of a multi-stage annotation pipeline. The annotations created for these experiments are being made freely available, and the content that was annotated is available from commercial sources at modest cost.

UR - http://www.scopus.com/inward/record.url?scp=85064044280&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85064044280&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-15742-5_18

DO - 10.1007/978-3-030-15742-5_18

M3 - Conference contribution

AN - SCOPUS:85064044280

SN - 9783030157418

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 188

EP - 199

BT - Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings

A2 - Taylor, Natalie Greene

A2 - Christian-Lamb, Caitlin

A2 - Nardi, Bonnie

A2 - Martin, Michelle H.

PB - Springer Verlag

ER -