Detecting academic plagiarism with graphs

Bin Hui Chou, Einoshin Suzuki

Research output: Contribution to journal (Conference article)

Abstract

In this paper, we tackle the problem of detecting academic plagiarism, which is considered a severe problem owing to the convenience of online publishing. Typical information retrieval methods, stopword-based methods and fingerprinting methods are commonly used to detect plagiarism by using the sequence of words as they appear in the article. As such, they fail to detect plagiarism when an author reconstructs a source article by re-ordering and recombining phrases. Because graph structures are well suited to representing relationships between entities, we propose a novel plagiarism detection method in which we use graphs to represent documents by modeling grammatical relationships between words. Experimental results show that our proposed method outperforms two n-gram methods and increases recall values by 10 to 20%.
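
The contrast drawn in the abstract between word-sequence methods and relationship-based representations can be illustrated with a small sketch. This is not the authors' algorithm: it simply compares word-trigram overlap with the overlap of a set of grammatical-relation edges on a sentence whose clauses have been swapped. The relation triples below are hand-written and hypothetical (in practice they would come from a dependency parser), and the names `word_ngrams`, `jaccard`, `source_edges` and `reordered_edges` are illustrative assumptions.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams (as tuples) found in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# A source sentence and a copy whose two clauses have been swapped.
source = "the method detects plagiarism and the graph represents documents"
reordered = "the graph represents documents and the method detects plagiarism"

# Hypothetical (head, relation, dependent) edges, written by hand for this
# example. Only the coordination edge changes when the clauses are swapped.
source_edges = {
    ("detects", "nsubj", "method"),
    ("detects", "obj", "plagiarism"),
    ("represents", "nsubj", "graph"),
    ("represents", "obj", "documents"),
    ("detects", "conj", "represents"),
}
reordered_edges = {
    ("detects", "nsubj", "method"),
    ("detects", "obj", "plagiarism"),
    ("represents", "nsubj", "graph"),
    ("represents", "obj", "documents"),
    ("represents", "conj", "detects"),
}

ngram_sim = jaccard(word_ngrams(source), word_ngrams(reordered))
edge_sim = jaccard(source_edges, reordered_edges)

print(f"trigram Jaccard:  {ngram_sim:.2f}")
print(f"edge-set Jaccard: {edge_sim:.2f}")
```

In this toy case the trigrams that span the clause boundary are lost after re-ordering, while most relation edges survive, so the edge-set similarity stays higher than the trigram similarity. The sketch says nothing about the recall figures reported in the paper.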

Original language: English
Pages (from-to): 293-304
Number of pages: 12
Journal: Revue des Nouvelles Technologies de l'Information
Volume: E.24
Publication status: Published - May 13 2013
Event: 13emes Journees Francophones sur l'Extraction et la Gestion des Connaissances, EGC 2013 - 13th French-Speaking Conference on Knowledge Discovery and Management, EGC 2013 - Toulouse, France
Duration: Jan 29 2013 - Feb 1 2013

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
  • Software

Cite this

Detecting academic plagiarism with graphs. / Chou, Bin Hui; Suzuki, Einoshin.

In: Revue des Nouvelles Technologies de l'Information, Vol. E.24, 13.05.2013, p. 293-304.

@article{fa0d4aab651746df90da7e67b46c0a09,
title = "Detecting academic plagiarism with graphs",
abstract = "In this paper, we tackle the problem of detecting academic plagiarism, which is considered a severe problem owing to the convenience of online publishing. Typical information retrieval methods, stopword-based methods and fingerprinting methods are commonly used to detect plagiarism by using the sequence of words as they appear in the article. As such, they fail to detect plagiarism when an author reconstructs a source article by re-ordering and recombining phrases. Because graph structures are well suited to representing relationships between entities, we propose a novel plagiarism detection method in which we use graphs to represent documents by modeling grammatical relationships between words. Experimental results show that our proposed method outperforms two n-gram methods and increases recall values by 10 to 20{\%}.",
author = "Chou, {Bin Hui} and Suzuki, Einoshin",
year = "2013",
month = "5",
day = "13",
language = "English",
volume = "E.24",
pages = "293--304",
journal = "Revue des Nouvelles Technologies de l'Information",
issn = "1764-1667",

}

TY - JOUR

T1 - Detecting academic plagiarism with graphs

AU - Chou, Bin Hui

AU - Suzuki, Einoshin

PY - 2013/5/13

Y1 - 2013/5/13

AB - In this paper, we tackle the problem of detecting academic plagiarism, which is considered a severe problem owing to the convenience of online publishing. Typical information retrieval methods, stopword-based methods and fingerprinting methods are commonly used to detect plagiarism by using the sequence of words as they appear in the article. As such, they fail to detect plagiarism when an author reconstructs a source article by re-ordering and recombining phrases. Because graph structures are well suited to representing relationships between entities, we propose a novel plagiarism detection method in which we use graphs to represent documents by modeling grammatical relationships between words. Experimental results show that our proposed method outperforms two n-gram methods and increases recall values by 10 to 20%.

UR - http://www.scopus.com/inward/record.url?scp=84877312575&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84877312575&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:84877312575

VL - E.24

SP - 293

EP - 304

JO - Revue des Nouvelles Technologies de l'Information

JF - Revue des Nouvelles Technologies de l'Information

SN - 1764-1667

ER -