A method for fine-grained document alignment using structural information

Naoki Tsujio, Toshiyuki Shimizu, Masatoshi Yoshikawa

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

It is useful to understand the correspondences between the parts of related documents, such as a conference paper and its revised version published as a journal paper, or different versions of the same document. However, it is hard to associate corresponding parts that have been heavily modified using content similarity alone. We propose a method of aligning documents that considers not only content information but also structural information in the documents. Our method consists of three steps: baseline alignment considering document order, merging, and swapping. In our evaluation experiments, we used papers presented at a domestic conference and at an international conference, and obtained their alignments using several methods. The results revealed the effectiveness of using document structures.
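
As an illustration of the first step only, the following is a minimal sketch of an order-preserving baseline alignment. It is not the authors' implementation: the abstract does not specify the similarity measure or the alignment algorithm, so the Jaccard token overlap, the dynamic-programming formulation, the `threshold` parameter, and the function names `jaccard` and `align_in_order` are all assumptions made for this example.

```python
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity (an assumed stand-in for content similarity)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def align_in_order(src: List[str], dst: List[str],
                   threshold: float = 0.2) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) aligning src[i] with dst[j].

    Pairs never cross, so document order is preserved. The total
    similarity of the chosen pairs is maximized by dynamic programming.
    """
    n, m = len(src), len(dst)
    sim = [[jaccard(s, d) for d in dst] for s in src]

    # best[i][j]: maximum total similarity when aligning src[:i] with dst[:j]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = sim[i - 1][j - 1]
            match = best[i - 1][j - 1] + (s if s >= threshold else 0.0)
            best[i][j] = max(best[i - 1][j], best[i][j - 1], match)

    # Trace back from the bottom-right corner to recover the chosen pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        s = sim[i - 1][j - 1]
        if s >= threshold and best[i][j] == best[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs
```

Under these assumptions, two parts that appear in opposite order in the two documents cannot both be aligned by this step, which is the kind of case a later swapping step would need to handle. Similarly, a part that was split or combined in the revised document matches at most one counterpart here, which is where a merging step would come in.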

Original language: English
Title of host publication: Web Technologies and Applications - 16th Asia-Pacific Web Conference, APWeb 2014, Proceedings
Publisher: Springer Verlag
Pages: 201-211
Number of pages: 11
ISBN (Print): 9783319111155
DOI
Publication status: Published - 2014
Externally published: Yes
Event: 16th Asia-Pacific Web Conference on Web Technologies and Applications, APWeb 2014 - Changsha, China
Duration: September 5, 2014 to September 7, 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8709 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th Asia-Pacific Web Conference on Web Technologies and Applications, APWeb 2014
Country/Territory: China
City: Changsha
Period: 9/5/14 to 9/7/14

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
