A dynamic-static approach of model fusion for document similarity computation

Jiyi Li, Yasuhito Asano, Toshiyuki Shimizu, Masatoshi Yoshikawa

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution


The semantic similarity of text document pairs can support valuable applications. Various basic models have been proposed for representing document content and computing document similarity, and each performs differently in different scenarios. Existing model selection and fusion approaches build improved models from these basic models at the granularity of the document collection. The resulting models are static across all document pairs and may suit only some of them. We propose a dynamic approach to model fusion, based on a Dynamic-Static Fusion Model (DSFM) that operates at the granularity of document pairs and therefore adapts to each pair. The dynamic module in DSFM learns to rank the basic models in order to predict the best basic model for a given document pair. To train this module, we propose a model categorization method that constructs ideal model labels for document pairs. The static module in DSFM is based on linear regression. We also propose a model selection method that selects appropriate candidate basic models for fusion and improves performance. Experiments on public document collections containing paragraph pairs and sentence pairs with human-rated similarity illustrate the effectiveness of our approach.
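The abstract does not give the details of DSFM, so the following is only a minimal Python sketch of the general idea under stated assumptions, not the authors' implementation. The two toy basic models, the pair features, the nearest-neighbour selector (standing in for the learning-to-rank step), the "closest to the human rating" labeling rule, and the mixing weight `alpha` are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]

def jaccard(a: str, b: str) -> float:
    """Toy basic model: word-set overlap."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def length_ratio(a: str, b: str) -> float:
    """Toy basic model: ratio of word counts."""
    la, lb = len(a.split()), len(b.split())
    return min(la, lb) / max(la, lb) if max(la, lb) else 0.0

# Hypothetical candidate set; the paper's basic models are not listed in the abstract.
BASIC_MODELS: Dict[str, Callable[[str, str], float]] = {
    "jaccard": jaccard,
    "length_ratio": length_ratio,
}

def basic_scores(pair: Pair) -> Dict[str, float]:
    return {name: model(*pair) for name, model in BASIC_MODELS.items()}

def ideal_label(pair: Pair, human: float) -> str:
    """Assumed labeling rule: the basic model closest to the human rating."""
    scores = basic_scores(pair)
    return min(scores, key=lambda name: abs(scores[name] - human))

def pair_features(pair: Pair) -> List[float]:
    """Hypothetical pair-level features for the dynamic module."""
    a, b = pair
    return [float(len(a.split()) + len(b.split())), jaccard(a, b)]

def predict_best_model(pair: Pair, train_pairs: List[Pair],
                       train_labels: List[str]) -> str:
    """Dynamic module stand-in: 1-nearest-neighbour over pair features."""
    q = pair_features(pair)
    def dist(p: Pair) -> float:
        return sum((x - y) ** 2 for x, y in zip(q, pair_features(p)))
    nearest = min(range(len(train_pairs)), key=lambda i: dist(train_pairs[i]))
    return train_labels[nearest]

def fit_static(train_pairs: List[Pair], ratings: List[float],
               lr: float = 0.1, epochs: int = 2000) -> List[float]:
    """Static module: linear regression fit by batch gradient descent.
    Returns one weight per basic model followed by a bias term."""
    X = [[basic_scores(p)[n] for n in BASIC_MODELS] for p in train_pairs]
    w = [0.0] * (len(BASIC_MODELS) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, ratings):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + w[-1] - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj / n
            grad[-1] += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def static_score(pair: Pair, w: List[float]) -> float:
    s = basic_scores(pair)
    return sum(wj * s[name] for wj, name in zip(w, BASIC_MODELS)) + w[-1]

def dsfm_score(pair: Pair, train_pairs: List[Pair], train_labels: List[str],
               w: List[float], alpha: float = 0.5) -> float:
    """Assumed fusion: weighted mix of the per-pair choice and the static score."""
    chosen = predict_best_model(pair, train_pairs, train_labels)
    return alpha * basic_scores(pair)[chosen] + (1 - alpha) * static_score(pair, w)
```

In this sketch the static weights are shared by every pair, while the dynamic module can pick a different basic model for each pair; how the actual DSFM combines the two modules is not stated in the abstract.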

Title of host publication: Web Information Systems Engineering – WISE 2015 - 16th International Conference, Proceedings
Editors: Shu-Ching Chen, Tao Li, Hua Wang, Yanchun Zhang, Wojciech Cellary, Dingding Wang, Jianyong Wang
Publisher: Springer Verlag
ISBN (Print): 9783319261898
Publication status: Published - 2015
Event: 16th International Conference on Web Information Systems Engineering, WISE 2015 - Miami, United States
Duration: 1 Nov 2015 to 3 Nov 2015


Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)


Conference: 16th International Conference on Web Information Systems Engineering, WISE 2015

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

