Contextualized word representations for multi-sense embedding

Kazuki Ashihara, Tomoyuki Kajiwara, Yuki Arase, Satoru Uchida

Research output: Contribution to conference, Paper, peer-reviewed

1 citation (Scopus)

Abstract

Distributed word representations are used in many natural language processing tasks. When dealing with ambiguous words, it is desired to generate multi-sense embeddings, i.e., multiple representations per word. Therefore, several methods have been proposed to generate different word representations based on parts of speech or topic, but these methods tend to be too coarse to deal with ambiguity. In this paper, we propose methods to generate multiple word representations for each word based on dependency structure relations. In order to deal with the data sparseness problem due to the increase in the size of vocabulary, the initial value for each word representation is determined using pre-trained word representations. It is expected that the representations of low frequency words will remain in the vicinity of the initial value, which will in turn reduce the negative effects of data sparseness. Extensive evaluation results confirm the effectiveness of our methods that significantly outperformed state-of-the-art methods for multi-sense embeddings. Detailed analysis of our method shows that the data sparseness problem is resolved due to the pre-training.
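
The initialization idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state how words are labeled with dependency relations or how training proceeds, so the function name `init_multi_sense`, the `(word, relation)` key format, and the toy vectors and relation labels below are all assumptions made for illustration. The sketch only shows each expanded vocabulary entry starting from a copy of the word's pre-trained vector.

```python
from typing import Dict, Iterable, List, Tuple


def init_multi_sense(
    pretrained: Dict[str, List[float]],
    tagged_tokens: Iterable[Tuple[str, str]],
) -> Dict[Tuple[str, str], List[float]]:
    """Create one vector per (word, dependency relation) pair.

    Hypothetical helper: each new vector starts as a copy of the word's
    pre-trained vector, so pairs that are rarely observed begin from a
    sensible point even if later training updates them only a few times.
    """
    senses: Dict[Tuple[str, str], List[float]] = {}
    for word, relation in tagged_tokens:
        key = (word, relation)
        if key in senses or word not in pretrained:
            continue  # already initialized, or no pre-trained vector to copy
        senses[key] = list(pretrained[word])  # copy, so each sense can diverge
    return senses


# Toy data: the vectors and relation labels are made up for illustration.
pretrained = {"bank": [0.1, 0.2], "river": [0.3, 0.4]}
tagged = [
    ("bank", "nsubj"),
    ("bank", "nmod"),
    ("river", "compound"),
    ("bank", "nsubj"),  # repeated pair, initialized only once
]
senses = init_multi_sense(pretrained, tagged)
print(len(senses))  # 3
```

Because every sense vector is an independent copy, subsequent training could move `("bank", "nsubj")` and `("bank", "nmod")` apart without touching the pre-trained table.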

Original language: English
Pages: 28-36
Number of pages: 9
Publication status: Published - 2018
Event: 32nd Pacific Asia Conference on Language, Information and Computation, PACLIC 2018 - Hong Kong, Hong Kong
Duration: Dec 1, 2018 to Dec 3, 2018

Conference

Conference: 32nd Pacific Asia Conference on Language, Information and Computation, PACLIC 2018
Country: Hong Kong
City: Hong Kong
Period: 12/1/18 to 12/3/18

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Computer Science (miscellaneous)
