Toward Three-Stage Automation of Annotation for Human Values

Emi Ishita, Satoshi Fukuda, Toru Oga, Douglas W. Oard, Kenneth R. Fleischmann, Yoichi Tomiura, An Shou Cheng

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Prior work on automated annotation of human values has sought to train text classification techniques to label text spans with labels that reflect specific human values such as freedom, justice, or safety. This confounds three tasks: (1) selecting the documents to be labeled, (2) selecting the text spans that express or reflect human values, and (3) assigning labels to those spans. This paper proposes a three-stage model in which separate systems can be optimally trained for each of the three stages. Experiments from the first stage, document selection, indicate that annotation diversity trumps annotation quality, suggesting that when multiple annotators are available, the traditional practice of adjudicating conflicting annotations of the same documents is not as cost effective as an alternative in which each annotator labels different documents. Preliminary results for the second stage, selecting value sentences, indicate that high recall (94%) can be achieved on that task with levels of precision (above 80%) that seem suitable for use as part of a multi-stage annotation pipeline. The annotations created for these experiments are being made freely available, and the content that was annotated is available from commercial sources at modest cost.
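The three-stage decomposition described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' system: the stage functions are passed in as plain callables, and every name here (`annotate`, `split_sentences`, `precision_recall`) is hypothetical. Sentence splitting is deliberately naive, and the precision/recall helper only shows how figures such as the reported 94% recall and 80% precision would be computed from sets of selected and truly value-bearing sentences.

```python
from typing import Callable, Iterable, List, Set, Tuple

# Hypothetical stage signatures; the paper does not specify an API.
DocumentSelector = Callable[[str], bool]      # stage 1: is this document worth labeling?
SentenceSelector = Callable[[str], bool]      # stage 2: does this sentence express a value?
ValueLabeler = Callable[[str], List[str]]     # stage 3: which values does it express?


def split_sentences(document: str) -> List[str]:
    """Naive sentence splitter (assumption: sentences end with a period)."""
    return [s.strip() for s in document.split(".") if s.strip()]


def annotate(
    documents: Iterable[str],
    select_document: DocumentSelector,
    select_sentence: SentenceSelector,
    label_sentence: ValueLabeler,
) -> List[Tuple[str, List[str]]]:
    """Run the three stages in order and return (sentence, labels) pairs."""
    results = []
    for document in documents:
        if not select_document(document):          # stage 1: document selection
            continue
        for sentence in split_sentences(document):
            if select_sentence(sentence):          # stage 2: value-sentence selection
                results.append((sentence, label_sentence(sentence)))  # stage 3: labeling
    return results


def precision_recall(predicted: Set[int], relevant: Set[int]) -> Tuple[float, float]:
    """Precision and recall of a predicted set against a gold-standard set."""
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall
```

Because each stage is an independent callable, a separately trained classifier could be substituted for any one of them without touching the other two, which is the property the three-stage model relies on.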

Original language: English
Title of host publication: Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings
Editors: Natalie Greene Taylor, Caitlin Christian-Lamb, Bonnie Nardi, Michelle H. Martin
Publisher: Springer Verlag
Pages: 188-199
Number of pages: 12
ISBN (Print): 9783030157418
DOIs: https://doi.org/10.1007/978-3-030-15742-5_18
Publication status: Published - Jan 1 2019
Event: 14th International Conference on Information in Contemporary Society, iConference 2019 - Washington, United States
Duration: Mar 31 2019 to Apr 3 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11420 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 14th International Conference on Information in Contemporary Society, iConference 2019
Country: United States
City: Washington
Period: 3/31/19 to 4/3/19

Fingerprint

Automation
Annotation
Labels
Costs
Text Classification
Pipelines
Experiments
Human
Express
Safety
Alternatives
Text

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)

Cite this

Ishita, E., Fukuda, S., Oga, T., Oard, D. W., Fleischmann, K. R., Tomiura, Y., & Cheng, A. S. (2019). Toward Three-Stage Automation of Annotation for Human Values. In N. G. Taylor, C. Christian-Lamb, B. Nardi, & M. H. Martin (Eds.), Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings (pp. 188-199). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11420 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-15742-5_18

Toward Three-Stage Automation of Annotation for Human Values. / Ishita, Emi; Fukuda, Satoshi; Oga, Toru; Oard, Douglas W.; Fleischmann, Kenneth R.; Tomiura, Yoichi; Cheng, An Shou.

Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings. ed. / Natalie Greene Taylor; Caitlin Christian-Lamb; Bonnie Nardi; Michelle H. Martin. Springer Verlag, 2019. p. 188-199 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11420 LNCS).

Ishita, E, Fukuda, S, Oga, T, Oard, DW, Fleischmann, KR, Tomiura, Y & Cheng, AS 2019, Toward Three-Stage Automation of Annotation for Human Values. in NG Taylor, C Christian-Lamb, B Nardi & MH Martin (eds), Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11420 LNCS, Springer Verlag, pp. 188-199, 14th International Conference on Information in Contemporary Society, iConference 2019, Washington, United States, 3/31/19. https://doi.org/10.1007/978-3-030-15742-5_18
Ishita E, Fukuda S, Oga T, Oard DW, Fleischmann KR, Tomiura Y et al. Toward Three-Stage Automation of Annotation for Human Values. In Taylor NG, Christian-Lamb C, Nardi B, Martin MH, editors, Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings. Springer Verlag. 2019. p. 188-199. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-030-15742-5_18
Ishita, Emi ; Fukuda, Satoshi ; Oga, Toru ; Oard, Douglas W. ; Fleischmann, Kenneth R. ; Tomiura, Yoichi ; Cheng, An Shou. / Toward Three-Stage Automation of Annotation for Human Values. Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings. editor / Natalie Greene Taylor ; Caitlin Christian-Lamb ; Bonnie Nardi ; Michelle H. Martin. Springer Verlag, 2019. pp. 188-199 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{e9c0e4d1371941068ba637ba0e30bbc1,
title = "Toward Three-Stage Automation of Annotation for Human Values",
abstract = "Prior work on automated annotation of human values has sought to train text classification techniques to label text spans with labels that reflect specific human values such as freedom, justice, or safety. This confounds three tasks: (1) selecting the documents to be labeled, (2) selecting the text spans that express or reflect human values, and (3) assigning labels to those spans. This paper proposes a three-stage model in which separate systems can be optimally trained for each of the three stages. Experiments from the first stage, document selection, indicate that annotation diversity trumps annotation quality, suggesting that when multiple annotators are available, the traditional practice of adjudicating conflicting annotations of the same documents is not as cost effective as an alternative in which each annotator labels different documents. Preliminary results for the second stage, selecting value sentences, indicate that high recall (94{\%}) can be achieved on that task with levels of precision (above 80{\%}) that seem suitable for use as part of a multi-stage annotation pipeline. The annotations created for these experiments are being made freely available, and the content that was annotated is available from commercial sources at modest cost.",
author = "Emi Ishita and Satoshi Fukuda and Toru Oga and Oard, {Douglas W.} and Fleischmann, {Kenneth R.} and Yoichi Tomiura and Cheng, {An Shou}",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/978-3-030-15742-5_18",
language = "English",
isbn = "9783030157418",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "188--199",
editor = "Taylor, {Natalie Greene} and Caitlin Christian-Lamb and Bonnie Nardi and Martin, {Michelle H.}",
booktitle = "Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings",
address = "Germany",
}

TY - GEN

T1 - Toward Three-Stage Automation of Annotation for Human Values

AU - Ishita, Emi

AU - Fukuda, Satoshi

AU - Oga, Toru

AU - Oard, Douglas W.

AU - Fleischmann, Kenneth R.

AU - Tomiura, Yoichi

AU - Cheng, An Shou

PY - 2019/1/1

Y1 - 2019/1/1

AB - Prior work on automated annotation of human values has sought to train text classification techniques to label text spans with labels that reflect specific human values such as freedom, justice, or safety. This confounds three tasks: (1) selecting the documents to be labeled, (2) selecting the text spans that express or reflect human values, and (3) assigning labels to those spans. This paper proposes a three-stage model in which separate systems can be optimally trained for each of the three stages. Experiments from the first stage, document selection, indicate that annotation diversity trumps annotation quality, suggesting that when multiple annotators are available, the traditional practice of adjudicating conflicting annotations of the same documents is not as cost effective as an alternative in which each annotator labels different documents. Preliminary results for the second stage, selecting value sentences, indicate that high recall (94%) can be achieved on that task with levels of precision (above 80%) that seem suitable for use as part of a multi-stage annotation pipeline. The annotations created for these experiments are being made freely available, and the content that was annotated is available from commercial sources at modest cost.

UR - http://www.scopus.com/inward/record.url?scp=85064044280&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85064044280&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-15742-5_18

DO - 10.1007/978-3-030-15742-5_18

M3 - Conference contribution

AN - SCOPUS:85064044280

SN - 9783030157418

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 188

EP - 199

BT - Information in Contemporary Society - 14th International Conference, iConference 2019, Proceedings

A2 - Taylor, Natalie Greene

A2 - Christian-Lamb, Caitlin

A2 - Nardi, Bonnie

A2 - Martin, Michelle H.

PB - Springer Verlag

ER -