| Reference |
Topics |
Year |
Link |
|
Paul Clough. Plagiarism in natural and programming languages: an overview of current tools and technologies.
Research Memoranda: CS-00-05, Department of Computer Science, University of Sheffield, UK
|
- Plagiarism basics
- Description of relevant features for plagiarism analysis
- Description of plagiarism detection tools
|
2000 |
|
|
Paul Clough. Old and new challenges in automatic plagiarism detection
National UK Plagiarism Advisory Service
|
|
2003 |
|
|
Paul Clough, Robert Gaizauskas. Corpora and Text Re-use.
Lüdeling, Kytö and McEnery (eds) Handbook of Corpus Linguistics. (Series: Handbooks of Linguistics and Communication Science), 1249–1271, Mouton de Gruyter.
|
|
2009 |
|
|
Thomas Lancaster, Fintan Culwin. Classifications of Plagiarism Detection Engines.
ITALICS 4 (2)
|
|
2005 |
|
|
Hermann Maurer, Frank Kappe, Bilal Zaka. Plagiarism - A Survey.
Journal of Universal Computer Science 12(8), pp. 1050-1084
|
- Basics of plagiarism analysis
- Definition of plagiarism
- Discussion of the reactions of different institutions
- Description of commercial tools
|
2006 |
|
|
SIGIR 2007 Workshop. Plagiarism Analysis, Authorship Identification and Near-Duplicate Detection.
|
- Proceedings of the PAN 2007 Workshop
|
2007 |
|
| Reference |
Topics |
Year |
Link |
|
Alberto Barrón-Cedeño, Paolo Rosso, David Pinto, Alfons Juan.
On cross-lingual plagiarism analysis using a statistical model.
Proceedings of the ECAI'08 PAN Workshop: Uncovering Plagiarism, Authorship and Social Software Misuse,
pp. 9-13. Patras, Greece
|
- Exploitation of alignment models
- Text similarity calculation based on bilingual dictionaries
|
2008 |
|
|
Zdenek Ceska, Michal Toman, Karel Jezek. Multilingual Plagiarism Detection.
Artificial Intelligence: Methodology, Systems, and Applications, LNCS (5253), pp. 83-92
|
|
2008 |
|
|
Chung-Hong Lee, Chih-Hong Wu, Hsin-Chang Yang.
A Platform Framework for Cross-lingual Text Relatedness Evaluation and Plagiarism Detection.
The 3rd International Conference on Innovative Computing Information and Control (ICICIC'08)
|
|
2008 |
|
|
David Pinto, Jorge Civera, Alberto Barrón-Cedeño, Alfons Juan, Paolo Rosso.
A statistical approach to crosslingual natural language tasks.
J. Algorithms doi:10.1016/j.jalgor.2009.02.005
|
- Solution to different tasks based on machine translation techniques
- Cross-language text similarity analysis based on statistical bilingual dictionaries
|
2009 |
|
|
Martin Potthast, Benno Stein, Maik Anderka. A Wikipedia-Based Multilingual Retrieval Model.
Macdonald, Ounis, Plachouras, Ruthven and White (editors) 30th European Conference on IR Research, ECIR 2008,
LNCS (4956), pp. 522-530 Glasgow
|
- Exploitation of comparable corpora (Wikipedia)
- CL analysis based on CL-Explicit Semantic Analysis
|
2008 |
|
|
Bruno Pouliquen, Ralf Steinberger, Camelia Ignat.
Automatic Identification of Document Translations in Large Multilingual Document Collections.
Angelova, Bontcheva, Mitkov, Nicolov, Nikolov (editors) Proceedings of the International Conference
Recent Advances in Natural Language Processing (RANLP’03), pp. 401-408.
|
- Cross-language retrieval of similar documents based on the EUROVOC thesaurus
|
2003 |
|
| Reference |
Topics |
Year |
Link |
|
JunPeng Bao, Caroline Lyon, Peter C. R. Lane, Wei Ji, James A. Malcolm.
Comparing Different Text Similarity Methods.
University of Hertfordshire
|
|
2007 |
|
|
Alberto Barrón-Cedeño, Paolo Rosso. On Automatic Plagiarism Detection based on n-grams Comparison.
Boughanem et al. (Eds.) ECIR 2009, LNCS 5478, pp. 696-700, Springer-Verlag Berlin Heidelberg
|
- Evaluation of n-gram based plagiarism detection with the METER corpus
|
2009 |
|
|
Alberto Barrón-Cedeño, Paolo Rosso. Towards the Exploitation of Statistical Language Models
for Plagiarism Detection with Reference.
Stein, Koppel, and Stamatatos (editors) ECAI Workshop on Uncovering Plagiarism, Authorship and Social Software Misuse (PAN 08),
pp. 15-19
|
- First attempts at exploiting statistical language models and entropy for plagiarism detection
|
2008 |
|
|
Alberto Barrón-Cedeño, Paolo Rosso, José Miguel Benedí.
Reducing the Plagiarism Detection Search Space on the Basis of the Kullback-Leibler Distance.
Gelbukh A. (ed.) CICLing 2009, LNCS 5449, pp. 523-534 Springer-Verlag
|
- Pre-selection of potential source documents given a suspicious text
- Search space reduction based on the Kullback-Leibler distance
|
2009 |
|
|
Yaniv Bernstein, Justin Zobel. A Scalable System for Identifying Co-Derivative Documents.
String Processing and Information Retrieval, LNCS (3246), pp. 55-67.
|
|
2004 |
|
|
Sergey Brin, James Davis, Hector Garcia-Molina. Copy Detection Mechanisms for Digital Documents
Proceedings of the ACM SIGMOD Annual Conference, pp. 398-409
|
- Description of the COPS system
|
1995 |
|
|
Andrei Z. Broder. On the resemblance and containment of documents.
Compression and Complexity of Sequences (SEQUENCES’97).
|
|
1997 |
|
|
Zdenek Ceska. Plagiarism Detection based on Singular Value Decomposition.
Advances in Natural Language Processing, LNCS (5221), pp. 108-119, Springer Berlin / Heidelberg.
|
|
2008 |
|
|
Abdur Chowdhury, Ophir Frieder, David Grossman, Mary Catherine McCabe.
Collection Statistics for Fast Duplicate Document Detection.
ACM Transactions on Information Systems 20(2), pp. 171–191.
|
|
2002 |
|
|
Paul Clough, Robert Gaizauskas, Scott Piao. Building and annotating a corpus for the study of journalistic text reuse.
Proceedings of the 3rd International Conference on Language Resources and Evaluation (V) (LREC-02), Spain, pp. 1678-1691
|
- Generation of a journalistic corpus for text reuse analysis
|
2002 |
|
|
Paul Clough, Robert Gaizauskas, Scott Piao, Yorick Wilks. METER: Measuring Text Reuse.
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, pp. 152-159
|
|
2002 |
|
|
Timothy C. Hoad, Justin Zobel. Methods for Identifying Versioned and Plagiarised Documents.
Journal of the American Society for Information Science and Technology 54, pp. 203-215
|
|
2003 |
|
|
Parvati Iyer, Abhipsita Singh. Document Similarity Analysis for a Plagiarism Detection System.
2nd Indian Int. Conf. on Artificial Intelligence (IICAI-2005), pp. 2534-2544
|
|
2005 |
|
|
NamOh Kang, Alexander Gelbukh, SangYong Han. PPChecker: Plagiarism Pattern Checker in Document Copy Detection.
Proc. TSD-2006: Text, Speech and Dialogue, LNAI (4188), pp. 661-667
|
- Vocabulary expansion based on WordNet
- Distinction between plagiarism levels: copy, rewording, word insertion, word deletion
|
2006 |
|
|
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspector.
Improved Robustness of Signature-Based Near-Replica Detection via Lexicon Randomization.
KDD-2004 (The Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining),
pp. 605-610, Seattle, WA
|
|
2004 |
|
|
Thomas Lancaster, Fintan Culwin. A visual argument for plagiarism detection using word pairs.
Proceedings of the Plagiarism Conference
|
|
2004 |
|
|
Caroline Lyon, Ruth Barrett, James Malcolm. A theoretical basis to the automated detection of copying between
texts, and its practical implementation in the Ferret plagiarism and collusion detector.
Plagiarism: Prevention, Practice and Policies Conference
|
- Description of the Ferret system
- Method based on the exhaustive comparison of word n-grams
|
2004 |
|
|
Caroline Lyon, James Malcolm, Bob Dickerson. Detecting short passages of similar text in large document collections.
Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 118-125
|
- Description of a method based on n-gram comparison
|
2001 |
|
|
Donald Metzler, Yaniv Bernstein, W. Bruce Croft, Alistair Moffat, Justin Zobel.
Similarity Measures for Tracking Information Flow.
Proc. CIKM’05
|
|
2005 |
|
|
Martin Potthast, Benno Stein. New Issues in Near-Duplicate Detection.
Preisach, Burkhardt, Schmidt-Thieme, Decker (editors) Data Analysis, Machine Learning and Applications, pp. 601-609
|
|
2008 |
|
|
Narayanan Shivakumar, Hector Garcia-Molina. Building a scalable and accurate copy detection mechanism.
Proceedings of the 1st ACM Conference on Digital Libraries (DL'96), pp. 160-168
|
- Detection of copied sentences via hash tables
|
1996 |
|
|
Mark Sanderson. Duplicate detection in the Reuters collection.
Technical Report. Department of Computing Science, University of Glasgow
|
|
1997 |
|
|
Antonio Si, Hong Va Leong, Rynson W. H. Lau. CHECK: a document plagiarism detection system.
Proc. of the 1997 ACM Symposium on Applied Computing, San Jose, CA, pp. 70-77
|
|
1997 |
|
|
Daniel R. White, Mike S. Joy. Sentence-Based Natural Language Plagiarism Detection.
ACM Journal on Educational Resources in Computing 4(4)
|
|
2004 |
|
| Reference |
Topics |
Year |
Link |
|
Rosa Maria Coyotl-Morales, Luis Villaseñor-Pineda, Manuel Montes-y-Gómez, Paolo Rosso.
Authorship attribution using Word Sequences.
Proc. of the 11th Iberoamerican Congress on Pattern Recognition, (CIARP 2006), LNCS (4225), pp. 844-853.
|
- Method based on Maximal Word Sequences comparison
|
2006 |
|
|
Ol'ga Feiguina, Graeme Hirst. Authorship attribution for small texts: Literary and forensic experiments.
Stein, Koppel, and Stamatatos (editors) SIGIR Workshop on Plagiarism Analysis, Authorship Identification, and Near-Duplicate Detection (PAN 07)
|
|
2007 |
|
|
Marco Kimler. Using Style Markers for Detecting Plagiarism in Natural Language Documents.
Thesis
|
|
2003 |
|
|
Kim Luyckx, Walter Daelemans. Personae: a corpus for author and personality prediction from text.
Proceedings of LREC-2008, Sixth International Language Resources and Evaluation Conference.
|
|
2008 |
|
|
Sven Meyer zu Eissen, Benno Stein. Intrinsic Plagiarism Detection.
Lalmas et al. (Eds.): Advances in Information Retrieval, Proc. of the 28th European Conf. on IR
Research, ECIR 2006, pp. 565-569. London
|
- Stylometric analysis of text
- Statistical text features
- Intrinsic plagiarism analysis
|
2006 |
|
|
Sven Meyer zu Eissen, Benno Stein, Marion Kulig. Plagiarism Detection without Reference Collections
In: Reinhold Decker and Hans J. Lenz (editors) Advances in Data Analysis, pp. 359-366
|
- Classification of plagiarism analysis methods
- Nice introduction to intrinsic plagiarism analysis
- Description of corpus building process
|
2006 |
|
|
Efstathios Stamatatos, Nikos Fakotakis, George Kokkinakis. Automatic Text Categorization in Terms of Genre and Author.
Computational Linguistics 26, pp. 471-495
|
- Broad overview of authorship analysis
|
2000 |
|
|
Efstathios Stamatatos, Nikos Fakotakis, George Kokkinakis.
Computer-Based Authorship Attribution Without Lexical Measures.
Computers and the Humanities 35(2)
|
|
2001 |
|
|
Benno Stein, Nedim Lipka, Sven Meyer zu Eissen. Meta Analysis within Authorship Verification.
19th International Conference on Database and Expert Systems Application, DEXA 2008, pp. 34-39
|
|
2008 |
|
|
Benno Stein, Sven Meyer zu Eissen. Intrinsic Plagiarism Analysis with Meta Learning.
Stein, Koppel, and Stamatatos (editors) SIGIR Workshop on Plagiarism Analysis, Authorship Identification, and Near-Duplicate Detection (PAN 07), pp. 45-50
|
|
2007 |
|
|
Yuta Tsuboi, Yuji Matsumoto. Authorship identification for heterogeneous documents.
IPSJ SIG Notes
|
- Authorship attribution of e-mails and web pages in Japanese
- Exploitation of sequential word patterns for SVM-based classification
|
2002 |
|
| Reference |
Topics |
Year |
Link |
|
Christian Arwin, S.M.M. Tahaghoghi. Plagiarism Detection across Programming Languages
Twenty-Ninth Australasian Computer Science Conference (ACSC2006)
|
- "Cross-Language" source code plagiarism analysis
|
2006 |
|
|
Francisco Rosales, Antonio García, Santiago Rodríguez, José L. Pedraza, Rafael Méndez, Manuel M. Nieto.
Detection of Plagiarism in Programming Assignments
IEEE Transactions on Education 51(2), pp. 174-183
|
- Description of pk2, a tool developed at UPM that detects plagiarism in high- and low-level programming languages
|
2008 |
|
|
Sam Grier. A Tool that Detects Plagiarism in Pascal Programs
Proceedings of the twelfth SIGCSE technical symposium on Computer science education, pp. 15-20
|
|
1981 |
|
|
Mike Joy, Michael Luck. Plagiarism in Programming Assignments.
IEEE Transactions on Education 42(2), pp. 129-133.
|
|
1999 |
|
|
Thomas Lancaster, Mark Tetlow.
Does automated anti-plagiarism have to be complex? Evaluating more appropriate software metrics for
finding collusion.
Proceedings of the 2005 Ascilite Conference.
|
|
2005 |
|
|
Karl J. Ottenstein. An Algorithmic Approach to the Detection and Prevention of Plagiarism.
ACM SIGCSE Bulletin 8(4), pp. 30-41.
|
|
1976 |
|
|
Alan Parker, James O. Hamblen. Computer Algorithms for Plagiarism Detection.
IEEE Transactions on Education 32, pp. 94-99
|
- Survey of algorithms for the detection of source code plagiarism
|
1989 |
|
|
Kristina L. Verco, Michael J. Wise.
Software for Detecting Suspected Plagiarism: Comparing Structure and Attribute-Counting Systems.
First Australian Conference on Computer Science Education, Sydney, Australia
|
|
1996 |
|