Survey of conversational behavior: Towards the design of a balanced corpus of everyday Japanese conversation

Hanae Koiso, Tomoyuki Tsuchiya, Ryoko Watanabe, Daisuke Yokomori, Masao Aizawa, Yasuharu Den

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

In 2016, we set about building a large-scale corpus of everyday Japanese conversation, a collection of conversations embedded in naturally occurring activities in daily life. We will collect more than 200 hours of recordings over six years, publishing the corpus in 2022. To construct such a huge corpus, we have conducted a pilot project, one of whose purposes is to establish a corpus design for collecting various kinds of everyday conversations in a balanced manner. For this purpose, we conducted a survey of everyday conversational behavior, with about 250 adults, in order to reveal how diverse our everyday conversational behavior is and to build an empirical foundation for corpus design. The questionnaire included when, where, how long, with whom, and in what kind of activity informants were engaged in conversations. We found that ordinary conversations show the following tendencies: i) they mainly consist of chats, business talks, and consultations; ii) in general, the number of participants is small and the duration of the conversation is short; iii) many conversations are conducted in private places such as homes, as well as in public places such as offices and schools; and iv) some questionnaire items are related to each other. This paper describes an overview of this survey study, and then discusses how to design a large-scale corpus of everyday Japanese conversation on this basis.

Original language: English
Title of host publication: Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016
Editors: Nicoletta Calzolari, Khalid Choukri, Helene Mazo, Asuncion Moreno, Thierry Declerck, Sara Goggi, Marko Grobelnik, Jan Odijk, Stelios Piperidis, Bente Maegaard, Joseph Mariani
Publisher: European Language Resources Association (ELRA)
Pages: 4434-4439
Number of pages: 6
ISBN (Electronic): 9782951740891
Publication status: Published - Jan 1 2016
Event: 10th International Conference on Language Resources and Evaluation, LREC 2016 - Portoroz, Slovenia
Duration: May 23 2016 to May 28 2016

Publication series

Name: Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016

Other

Other: 10th International Conference on Language Resources and Evaluation, LREC 2016
Country: Slovenia
City: Portoroz
Period: 5/23/16 to 5/28/16

Fingerprint

conversation
Japanese Conversation
questionnaire
chat
pilot project
recording
school
Questionnaire

All Science Journal Classification (ASJC) codes

  • Linguistics and Language
  • Library and Information Sciences
  • Language and Linguistics
  • Education

Cite this

Koiso, H., Tsuchiya, T., Watanabe, R., Yokomori, D., Aizawa, M., & Den, Y. (2016). Survey of conversational behavior: Towards the design of a balanced corpus of everyday Japanese conversation. In N. Calzolari, K. Choukri, H. Mazo, A. Moreno, T. Declerck, S. Goggi, M. Grobelnik, J. Odijk, S. Piperidis, B. Maegaard, ... J. Mariani (Eds.), Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016 (pp. 4434-4439). (Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016). European Language Resources Association (ELRA).

@inproceedings{e9878d79b4c04aa9bcf4852a61909e98,
title = "Survey of conversational behavior: Towards the design of a balanced corpus of everyday Japanese conversation",
abstract = "In 2016, we set about building a large-scale corpus of everyday Japanese conversation, a collection of conversations embedded in naturally occurring activities in daily life. We will collect more than 200 hours of recordings over six years, publishing the corpus in 2022. To construct such a huge corpus, we have conducted a pilot project, one of whose purposes is to establish a corpus design for collecting various kinds of everyday conversations in a balanced manner. For this purpose, we conducted a survey of everyday conversational behavior, with about 250 adults, in order to reveal how diverse our everyday conversational behavior is and to build an empirical foundation for corpus design. The questionnaire included when, where, how long, with whom, and in what kind of activity informants were engaged in conversations. We found that ordinary conversations show the following tendencies: i) they mainly consist of chats, business talks, and consultations; ii) in general, the number of participants is small and the duration of the conversation is short; iii) many conversations are conducted in private places such as homes, as well as in public places such as offices and schools; and iv) some questionnaire items are related to each other. This paper describes an overview of this survey study, and then discusses how to design a large-scale corpus of everyday Japanese conversation on this basis.",
author = "Hanae Koiso and Tomoyuki Tsuchiya and Ryoko Watanabe and Daisuke Yokomori and Masao Aizawa and Yasuharu Den",
year = "2016",
month = "1",
day = "1",
language = "English",
series = "Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016",
publisher = "European Language Resources Association (ELRA)",
pages = "4434--4439",
editor = "Nicoletta Calzolari and Khalid Choukri and Helene Mazo and Asuncion Moreno and Thierry Declerck and Sara Goggi and Marko Grobelnik and Jan Odijk and Stelios Piperidis and Bente Maegaard and Joseph Mariani",
booktitle = "Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016",
}

TY - GEN

T1 - Survey of conversational behavior

T2 - Towards the design of a balanced corpus of everyday Japanese conversation

AU - Koiso, Hanae

AU - Tsuchiya, Tomoyuki

AU - Watanabe, Ryoko

AU - Yokomori, Daisuke

AU - Aizawa, Masao

AU - Den, Yasuharu

PY - 2016/1/1

Y1 - 2016/1/1

AB - In 2016, we set about building a large-scale corpus of everyday Japanese conversation, a collection of conversations embedded in naturally occurring activities in daily life. We will collect more than 200 hours of recordings over six years, publishing the corpus in 2022. To construct such a huge corpus, we have conducted a pilot project, one of whose purposes is to establish a corpus design for collecting various kinds of everyday conversations in a balanced manner. For this purpose, we conducted a survey of everyday conversational behavior, with about 250 adults, in order to reveal how diverse our everyday conversational behavior is and to build an empirical foundation for corpus design. The questionnaire included when, where, how long, with whom, and in what kind of activity informants were engaged in conversations. We found that ordinary conversations show the following tendencies: i) they mainly consist of chats, business talks, and consultations; ii) in general, the number of participants is small and the duration of the conversation is short; iii) many conversations are conducted in private places such as homes, as well as in public places such as offices and schools; and iv) some questionnaire items are related to each other. This paper describes an overview of this survey study, and then discusses how to design a large-scale corpus of everyday Japanese conversation on this basis.

UR - http://www.scopus.com/inward/record.url?scp=85037061332&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85037061332&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85037061332

T3 - Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016

SP - 4434

EP - 4439

BT - Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016

A2 - Calzolari, Nicoletta

A2 - Choukri, Khalid

A2 - Mazo, Helene

A2 - Moreno, Asuncion

A2 - Declerck, Thierry

A2 - Goggi, Sara

A2 - Grobelnik, Marko

A2 - Odijk, Jan

A2 - Piperidis, Stelios

A2 - Maegaard, Bente

A2 - Mariani, Joseph

PB - European Language Resources Association (ELRA)

ER -