TY - GEN
T1 - Algorithms for estimation of comic speakers considering reading order of frames and texts
AU - Omori, Yuga
AU - Nagamizo, Kota
AU - Ikeda, Daisuke
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Recent machine learning methods have focused on multimodal input and cross-modal tasks, and they are applied to problems in various domains. Associating comic texts with characters using these approaches is useful for commercial activities such as speech synthesis and automatic translation of texts. In this study, we address the task of associating a text with its speaker in comics. Matching the two is challenging because the association is not self-evident, and few studies have attempted it. Previous studies have given little consideration to the continuity of comics, such as narrative flow or contextual information. We assume that considering this continuity is effective for speaker estimation. This paper proposes algorithms for estimating the reading order of frames or texts, as well as methods for estimating speakers based on these orders. Our proposed method improves accuracy compared to previous methods. Considering the frame order provides an effective clue for comic speaker estimation.
UR - http://www.scopus.com/inward/record.url?scp=85139545316&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85139545316&partnerID=8YFLogxK
U2 - 10.1109/IIAIAAI55812.2022.00080
DO - 10.1109/IIAIAAI55812.2022.00080
M3 - Conference contribution
AN - SCOPUS:85139545316
T3 - Proceedings - 2022 12th International Congress on Advanced Applied Informatics, IIAI-AAI 2022
SP - 367
EP - 372
BT - Proceedings - 2022 12th International Congress on Advanced Applied Informatics, IIAI-AAI 2022
A2 - Matsuo, Tokuro
A2 - Takamatsu, Kunihiko
A2 - Ono, Yuichi
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 12th International Congress on Advanced Applied Informatics, IIAI-AAI 2022
Y2 - 2 July 2022 through 7 July 2022
ER -