Plagiarism detection using document similarity based on distributed representation

Kensuke Baba, Tetsuya Nakatoh, Toshiro Minami

Research output: Contribution to journal, Conference article

3 Citations (Scopus)

Abstract

Plagiarism detection requires accurate methods and is generally implemented on the basis of similarity between documents. This paper evaluates the validity of using distributed representations of words to define document similarity. It proposes a plagiarism detection method based on the local maximal value of the length of the longest common subsequence (LCS), weighted by a distributed representation. The proposed method and two straightforward baselines, one based on the simple length of the LCS and one on the local maximal value of the LCS with no weight, are applied to the dataset of a plagiarism detection competition. The experimental results show that the proposed method is useful in applications that need strict detection of complex plagiarism.
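The abstract contrasts three scores: plain LCS length, a local maximum of LCS, and a local maximum weighted by word vectors. The abstract does not give the recurrences, so the following is only a minimal sketch under stated assumptions: the local score is modeled as a Smith-Waterman-style alignment, the weight is assumed to be the cosine similarity of word vectors, and the names `word_weight`, `local_weighted_score`, `threshold`, `gap`, and `mismatch` are hypothetical, as are the toy vectors.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def word_weight(a, b, vectors, threshold=0.8):
    """Assumed weight: 1.0 for identical words, else thresholded cosine similarity."""
    if a == b:
        return 1.0
    if a in vectors and b in vectors:
        s = cosine(vectors[a], vectors[b])
        return s if s >= threshold else 0.0
    return 0.0

def lcs_length(x, y):
    """Classic (global, unweighted) LCS length of two word lists."""
    prev = [0] * (len(y) + 1)
    for a in x:
        cur = [0]
        for j, b in enumerate(y, 1):
            cur.append(prev[j - 1] + 1 if a == b else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def local_weighted_score(x, y, vectors, gap=0.5, mismatch=0.5, threshold=0.8):
    """Best local alignment score; matches earn their weight, and gaps and
    mismatches are penalized (hypothetical penalties). Scores never drop below 0."""
    best = 0.0
    prev = [0.0] * (len(y) + 1)
    for a in x:
        cur = [0.0]
        for j, b in enumerate(y, 1):
            w = word_weight(a, b, vectors, threshold)
            diag = prev[j - 1] + (w if w > 0 else -mismatch)
            cur.append(max(0.0, diag, prev[j] - gap, cur[j - 1] - gap))
            best = max(best, cur[j])
        prev = cur
    return best

# Toy vectors (hypothetical): "copy" and "duplicate" point in nearly the same direction.
vectors = {"copy": (1.0, 0.0), "duplicate": (0.9, 0.1)}
x = "students copy the text".split()
y = "students duplicate the text".split()

plain = lcs_length(x, y)                         # exact word matches only
unweighted = local_weighted_score(x, y, {})      # local score without word vectors
weighted = local_weighted_score(x, y, vectors)   # paraphrased word earns partial credit
```

In this toy pair, the synonym substitution costs the unweighted local score a mismatch penalty, while the weighted variant treats "copy" and "duplicate" as a near match. That is the kind of paraphrased plagiarism the weighted score is meant to catch.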

Original language: English
Pages (from-to): 382-387
Number of pages: 6
Journal: Procedia Computer Science
Volume: 111
DOI
Publication status: Published - Jan 1 2017
Event: 8th International Conference on Advances in Information Technology, IAIT 2016 - Macao
Duration: Dec 19 2016 to Dec 22 2016

All Science Journal Classification (ASJC) codes

  • Computer Science(all)
