survey

Open Access

Academic Plagiarism Detection: A Systematic Literature Review

Authors:
Tomáš Foltýnek (Department of Informatics, Mendel University in Brno, Czechia and University of Wuppertal, Germany; ORCID 0000-0001-8412-5553)
Norman Meuschke (University of Wuppertal, Germany and University of Konstanz, Germany)
Bela Gipp (University of Wuppertal, Germany and University of Konstanz, Germany)

ACM Computing Surveys, Volume 52, Issue 6, Article No. 112, pp. 1–42. https://doi.org/10.1145/3345317

Published: 16 October 2019

Abstract

This article summarizes the research on computational methods to detect academic plagiarism by systematically reviewing 239 research papers published between 2013 and 2018. To structure the presentation of the research contributions, we propose novel technically oriented typologies for plagiarism prevention and detection efforts, the forms of academic plagiarism, and computational plagiarism detection methods. We show that academic plagiarism detection is a highly active research field. Over the period we review, the field has seen major advances regarding the automated detection of strongly obfuscated and thus hard-to-identify forms of academic plagiarism. These improvements mainly originate from better semantic text analysis methods, the investigation of non-textual content features, and the application of machine learning. We identify a research gap in the lack of methodologically thorough performance evaluations of plagiarism detection systems. Concluding from our analysis, we see the integration of heterogeneous analysis methods for textual and non-textual content features using machine learning as the most promising area for future research contributions to improve the detection of academic plagiarism further.

References

Assad Abbas, Limin Zhang, and Samee U. Khan. 2014. A literature review on the state-of-the-art in patent analysis. World Pat. Inf. 37 (2014), 3–13. DOI:10.1016/j.wpi.2013.12.006Google ScholarCross Ref
Asad Abdi, Norisma Idris, Rasim M. Alguliyev, and Ramiz M. Aliguliyev. 2015. PDLK: Plagiarism detection using linguistic knowledge. Expert Syst. Appl. 42, 22 (2015), 8936--8946. DOI:10.1016/j.eswa.2015.07.048Google ScholarDigital Library
Samira Abnar, Mostafa Dehghani, Hamed Zamani, and Azadeh Shakery. 2014. Expanded n-grams for semantic text alignment—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).Google Scholar
Sadia Afroz, Aylin Caliskan Islam, Ariel Stolerman, Rachel Greenstadt, and Damon McCoy. 2014. Doppelgänger finder: Taking stylometry to the underground. In Proceedings of the 2014 IEEE Symposium on Security and Privacy. 212--226.Google ScholarDigital Library
Naveed Afzal, Yanshan Wang, and Hongfang Liu. 2016. MayoNLP at SemEval-2016 Task 1: Semantic textual similarity based on lexical semantic net and deep learning semantic model. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval’16). 674--679.Google ScholarCross Ref
Basant Agarwal, Heri Ramampiaro, Helge Langseth, and Massimiliano Ruocco. 2018. A deep network model for paraphrase detection in short text messages. Inf. Process. Manag. 54, 6 (2018), 922--937. DOI:10.1016/j.ipm.2018.06.005Google ScholarCross Ref
Eneko Agirre, Carmen Banea, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, Rada Mihalcea, German Rigau, and Janyce Wiebe. 2016. Semeval-2016 task 1: Semantic textual similarity, monolingual and cross-lingual evaluation. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval’16). 497--511.Google ScholarCross Ref
Mayank Agrawal and Dilip Kumar Sharma. 2016. A state of art on source code plagiarism detection. In Proceedings of the 2016 2nd International Conference on Next Generation Computing Technologies (NGCT’16). 236--241. DOI:10.1109/NGCT.2016.7877421Google ScholarCross Ref
Mohammad Al-Smadi, Zain Jaradat, Mahmoud Al-Ayyoub, and Yaser Jararweh. 2017. Paraphrase identification and semantic text similarity analysis in arabic news tweets using lexical, syntactic, and semantic features. Inf. Process. Manag. 53, 3 (2017), 640--652. DOI:10.1016/j.ipm.2017.01.002Google ScholarDigital Library
Houda Alberts. 2017. Author clustering with the aid of a simple distance measure—Notebook for PAN at CLEF 2017. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’17).Google Scholar
Hanan Aldarmaki and Mona Diab. 2016. GWU NLP at SemEval-2016 Shared Task 1: Matrix factorization for crosslingual STS. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval’16). 663--667.Google ScholarCross Ref
Mahmoud Alewiwi, Cengiz Orencik, and Erkay Savas. 2016. Efficient top-k similarity document search utilizing distributed file systems and cosine similarity. Cluster Comput. 19, 1 (2016), 109--126. DOI:10.1007/s10586-015-0506-0Google ScholarDigital Library
Zakiy Firdaus Alfikri and Ayu Purwarianti. 2014. Detailed analysis of extrinsic plagiarism detection system using machine learning approach (naive bayes and svm). Indones. J. Electr. Eng. Comput. Sci. 12, 11 (2014), 7884--7894.Google Scholar
Muna Alsallal, Rahat Iqbal, Saad Amin, and Anne James. 2013. Intrinsic plagiarism detection using latent semantic indexing and stylometry. In Proceedings of the 2013 6th International Conference on Developments in eSystems Engineering. 145--150. DOI:10.1109/DeSE.2013.34Google ScholarDigital Library
Muna AlSallal, Rahat Iqbal, Saad Amin, Anne James, and Vasile Palade. 2016. An integrated machine learning approach for extrinsic plagiarism detection. In Proceedings of the 2016 9th International Conference on Developments in eSystems Engineering (DeSE’16). 203--208. DOI:10.1109/DeSE.2016.1Google ScholarCross Ref
Muna AlSallal, Rahat Iqbal, Vasile Palade, Saad Amin, and Victor Chang. 2019. An integrated approach for intrinsic plagiarism detection. Fut. Gener. Comput. Syst. 96 (2019), 700--712. DOI:10.1016/j.future.2017.11.023Google ScholarDigital Library
Miguel A. Álvarez-Carmona, Marc Franco-Salvador, Esaú Villatoro-Tello, Manuel Montes-y-Gómez, Paolo Rosso, and Luis Villaseñor-Pineda. 2018. Semantically-informed distance and similarity measures for paraphrase plagiarism identification. J. Intell. Fuzzy Syst. 34, 5 (2018), 2983--2990.Google ScholarCross Ref
Faisal Alvi, Mark Stevenson, and Paul Clough. 2014. Hashing and merging heuristics for text reuse detection. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14). 939--946.Google Scholar
Faisal Alvi, Mark Stevenson, and Paul Clough. 2017. Plagiarism detection in texts obfuscated with homoglyphs. In Advances in Information Retrieval. 669--675.Google Scholar
Salha Alzahrani. 2015. Arabic plagiarism detection using word correlation in N-Grams with K-Overlapping approach—Working notes for PAN-AraPlagDet at FIRE 2015. In Proceedings of the Forum for Information Retrieval Evaluation (FIRE’15).Google Scholar
Salha M. Alzahrani, Naomie Salim, and Ajith Abraham. 2012. Understanding plagiarism linguistic patterns, textual features, and detection methods. IEEE Trans. Syst. Man, Cybern. C Appl. Rev. 42, 2 (2012), 133--149.Google ScholarDigital Library
Habibollah Asghari, Salar Mohtaj, Omid Fatemi, Heshaam Faili, Paolo Rosso, and Martin Potthast. 2016. Algorithms and corpora for persian plagiarism detection. In Proceedings of the Forum for Information Retrieval Evaluation (FIRE’16). 61.Google Scholar
Duygu Ataman, Jose G. C. De Souza, Marco Turchi, and Matteo Negri. 2016. FBK HLT-MT at SemEval-2016 Task 1: Cross-lingual semantic similarity measurement using quality estimation features and compositional bilingual word embeddings. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval’16). 570--576.Google ScholarCross Ref
Farzindar Atefeh and Wael Khreich. 2015. A survey of techniques for event detection in twitter. Comput. Intell. 31, 1 (2015), 132--164. DOI:10.1111/coin.12017Google ScholarDigital Library
Douglas Bagnall. 2015. Author identification using multi-headed recurrent neural networks—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).Google Scholar
Douglas Bagnall. 2016. Authorship clustering using multi-headed recurrent neural networks—Notebook for PAN at CLEF 2016. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’16).Google Scholar
Marco Baroni, Georgiana Dinu, and Germán Kruszewski. 2014. Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 238--247.Google ScholarCross Ref
Alberto Barrón-Cedeño, Parth Gupta, and Paolo Rosso. 2013. Methods for cross-language plagiarism detection. Knowl.-Based Syst. 50 (2013), 211--217. DOI:10.1016/j.knosys.2013.06.018Google ScholarCross Ref
Alberto Barrón-Cedeño, Marta Vila, M. Antònia Martí, and Paolo Rosso. 2013. Plagiarism meets paraphrasing: insights for the next generation in automatic plagiarism detection. Comput. Linguist. 39, 4 (2013), 917--947. DOI:10.1162/COLI_a_00153Google ScholarDigital Library
Alberto Bartoli, Alex Dagri, Andrea De Lorenzo, Eric Medvet, and Fabiano Tarlao. 2015. An author verification approach based on differential features—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).Google Scholar
Jeffrey Beall. 2016. Best practices for scholarly authors in the age of predatory journals. Ann. R. Coll. Surg. Engl. 98, 2 (2016), 77--79.Google ScholarCross Ref
Imene Bensalem, Imene Boukhalfa, Paolo Rosso, Lahsen Abouenour, Kareem Darwish, and Salim Chikhi. 2015. Overview of the AraPlagDet PAN@FIRE2015 shared task on arabic plagiarism detection. In Proceedings of the Forum for Information Retrieval Evaluation (FIRE’15).Google Scholar
Imene Bensalem, Salim Chikhi, and Paolo Rosso. 2013. Building arabic corpora from wikisource. In Proceedings of the 2013 ACS International Conference on Computer Systems and Applications (AICCSA’13). 1--2. DOI:10.1109/AICCSA.2013.6616474Google ScholarCross Ref
Imene Bensalem, Paolo Rosso, and Salim Chikhi. 2013. A new corpus for the evaluation of arabic intrinsic plagiarism detection. In Information Access Evaluation: Multilinguality, Multimodality, and Visualization. 53--58.Google Scholar
Imene Bensalem, Paolo Rosso, and Salim Chikhi. 2014. Intrinsic plagiarism detection using n-gram classes. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP’14). 1459--1464.Google ScholarCross Ref
Ergun Bicici. 2016. RTM at SemEval-2016 Task 1: Predicting semantic similarity with referential translation machines and related statistics. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval’16). 758--764.Google ScholarCross Ref
Victoria Bobicev. 2013. Authorship detection with PPM—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).Google Scholar
Hadj Ahmed Bouarara, Amine Rahmani, Reda Mohamed Hamou, and Abdelmalek Amine. 2014. Machine learning tool and meta-heuristic based on genetic algorithms for plagiarism detection over mail service. In Proceedings of the 2014 IEEE/ACIS 13th International Conference on Computer and Information Science (ICIS’14). 157--162. DOI:10.1109/ICIS.2014.6912125Google ScholarCross Ref
Barry Bozeman, Daniel Fay, and Catherine P. Slade. 2013. Research collaboration in universities and academic entrepreneurship: The-state-of-the-art. J. Technol. Transf. 38, 1 (2013), 1--67. DOI:10.1007/s10961-012-9281-8Google ScholarCross Ref
Pearl Brereton, Barbara A. Kitchenham, David Budgen, Mark Turner, and Mohamed Khalil. 2007. Lessons from applying the systematic literature review process within the software engineering domain. J. Syst. Softw. 80, 4 (2007), 571--583. DOI:10.1016/j.jss.2006.07.009Google ScholarDigital Library
Tomáš Brychcín and Lukáš Svoboda. 2016. UWB at SemEval-2016 Task 1: Semantic textual similarity using lexical, syntactic, and semantic information. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval’16). 588--594.Google ScholarCross Ref
Davide Buscaldi, Joseph Le Roux, Jorge J. García Flores, and Adrian Popescu. 2013. LIPN-CORE: Semantic text similarity using n-grams, wordnet, syntactic analysis, ESA and information retrieval based features. In Proceedings of the 2nd Joint Conference on Lexical and Computational Semantics. 63.Google Scholar
Esteban Castillo, Ofelia Cervantes, Darnes Vilariño, David Pinto, and Saul León. 2014. Unsupervised method for the authorship identification task—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).Google Scholar
Daniel Castro, Yaritza Adame, María Pelaez, and Rafael Muñoz. 2015. Authorship verification, combining linguistic features and different similarity functions—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).Google Scholar
Daniele Cerra, Mihai Datcu, and Peter Reinartz. 2014. Authorship analysis based on data compression. Pattern Recogn. Lett. 42 (2014), 79--84. DOI:10.1016/j.patrec.2014.01.019Google ScholarCross Ref
Zdenek Ceska. 2008. Plagiarism detection based on singular value decomposition. In Advances in Natural Language Processing. Springer, 108--119.Google Scholar
Man Yan Miranda Chong. 2013. A study on plagiarism detection and plagiarism direction identification using natural language processing techniques. Ph. D Thesis. University of Wolverhampton.Google Scholar
Hussain A. Chowdhury and Dhruba K. Bhattacharyya. 2016. Plagiarism: Taxonomy, tools and detection techniques. In Proceedings of the 19th National Convention on Knowledge, Library and Information Networking (NACLIN’16).Google Scholar
Daniela Chudá, Jozef Lačný, Maroš Maršalek, Pavel Michalko, and Ján Súkeník. 2013. Plagiarism detection in slovak texts on the web. In Proceedings of the Conference on Plagiarism across Europe and Beyond. 249--260.Google Scholar
Guy J. Curtis and Joseph Clare. 2017. How prevalent is contract cheating and to what extent are students repeat offenders? J. Acad. Ethics 15, 2 (2017), 115--124. DOI:10.1007/s10805-017-9278-xGoogle ScholarCross Ref
Guy J. Curtis and Lucia Vardanega. 2016. Is plagiarism changing over time? A 10-year time-lag study with three points of measurement. High. Educ. Res. Dev. 35, 6 (2016), 1167--1179. DOI:10.1080/07294360.2016.1161602Google ScholarCross Ref
Michiel van Dam. 2013. A basic character n-gram approach to authorship verification—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).Google Scholar
Avishek Dan and Pushpak Bhattacharyya. 2013. Cfilt-core: Semantic textual similarity using universal networking language. In Proceedings of the 2nd Joint Conference on Lexical and Computational Semantics (*SEM’13). 216--220.Google Scholar
Ali Daud, Wahab Khan, and Dunren Che. 2017. Urdu language processing: a survey. Artif. Intell. Rev. 47, 3 (2017), 279--311. DOI:10.1007/s10462-016-9482-xGoogle ScholarDigital Library
Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, and Richard A. Harshman. 1990. Indexing by latent semantic analysis. J. Am. Soc. Inf. Sci. 41, 6 (1990), 391. DOI:10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9Google ScholarCross Ref
T. Dharani and I. Laurence Aroquiaraj. 2013. A survey on content based image retrieval. In Proceedings of the 2013 International Conference on Pattern Recognition, Informatics and Mobile Engineering. 485--490. DOI:10.1109/ICPRIME.2013.6496719Google Scholar
Michal Ďuračík, Emil Kršák, and Patrik Hrkút. 2017. Current trends in source code analysis, plagiarism detection and issues of analysis big datasets. Proc. Eng. 192 (2017), 136--141. DOI:10.1016/j.proeng.2017.06.024Google ScholarCross Ref
Nava Ehsan and Azadeh Shakery. 2016. Candidate document retrieval for cross-lingual plagiarism detection using two-level proximity information. Inf. Process. Manag. 52, 6 (2016), 1004--1017. DOI:10.1016/j.ipm.2016.04.006Google ScholarDigital Library
Nava Ehsan and Azadeh Shakery. 2016. A pairwise document analysis approach for monolingual plagiarism detection. In Proceedings of the Forum for Information Retrieval Evaluation (FIRE’16). 145--148.Google Scholar
Nava Ehsan, Frank Wm. Tompa, and Azadeh Shakery. 2016. Using a dictionary and n-gram alignment to improve fine-grained cross-language plagiarism detection. In Proceedings of the 2016 ACM Symposium on Document Engineering (DocEng’16). 59--68. DOI:10.1145/2960811.2960817Google ScholarDigital Library
Taiseer Abdalla Elfadil Eisa, Naomie Salim, and Salha Alzahrani. 2015. Existing plagiarism detection techniques: A systematic mapping of the scholarly literature. Online Inf. Rev. 39, 3 (2015), 383--400.Google ScholarCross Ref
El-Sayed M. El-Alfy, Radwan E. Abdel-Aal, Wasfi G. Al-Khatib, and Faisal Alvi. 2015. Boosting paraphrase detection through textual similarity metrics with abductive networks. Appl. Soft Comput. 26, (2015), 444--453. DOI:10.1016/j.asoc.2014.10.021Google ScholarDigital Library
Victoria Elizalde. 2013. Using statistic and semantic analysis to detect plagiarism—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).Google Scholar
Victoria Elizalde. 2014. Using noun phrases and tf-idf for plagiarized document retrieval—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).Google Scholar
Erik von Elm, Greta Poglia, Bernhard Walder, and Martin R. Tramèr. 2004. Different patterns of duplicate publication: An Analysis of articles used in systematic reviews. JAMA 291, 8 (2004), 974--980. DOI:10.1001/jama.291.8.974Google ScholarCross Ref
Fezeh Esteki and Faramarz Safi Esfahani. 2016. A plagiarism detection approach based on SVM for persian texts. In Proceedings of the Forum for Information Retrieval Evaluation (FIRE’16). 149--153.Google Scholar
Asli Eyecioglu and Bill Keller. 2015. Twitter paraphrase identification with simple overlap features and SVMs. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval’15). 64--69.Google ScholarCross Ref
Jody Condit Fagan. 2017. An evidence-based review of academic web search engines, 2014--2016: Implications for librarians’ practice and research agenda. Inf. Technol. Libr. 36, 2 (2017), 7.Google Scholar
Christiane Fellbaum. 1998. WordNet: An Electronic Lexical Database (Language, Speech, and Communication). The MIT Press.Google Scholar
Vanessa Wei Feng and Graeme Hirst. 2013. Authorship verification with entity coherence and other rich linguistic features—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).Google Scholar
Rafael Ferreira, George D. C. Cavalcanti, Fred Freitas, Rafael Dueire Lins, Steven J. Simske, and Marcelo Riss. 2018. Combining sentence similarities measures to identify paraphrases. Comput. Speech Lang. 47 (2018), 59--73. DOI:10.1016/j.csl.2017.07.002Google ScholarDigital Library
Jérémy Ferrero, Frederic Agnes, Laurent Besacier, and Didier Schwab. 2017. CompiLIG at SemEval-2017 Task 1: Cross-language plagiarism detection methods for semantic textual similarity. arXiv:1704.01346.Google Scholar
Jérémy Ferrero, Frédéric Agnes, Laurent Besacier, and Didier Schwab. 2017. Using word embedding for cross-language plagiarism detection. arXiv:1702.03082.Google Scholar
Jérémy Ferrero, Laurent Besacier, Didier Schwab, and Frédéric Agnes. 2017. Deep investigation of cross-language plagiarism detection methods. arXiv:1705.08828.Google Scholar
Tomáš Foltýnek and Irene Glendinning. 2015. Impact of policies for plagiarism in higher education across europe: Results of the project. Acta Univ. Agric. Silvic. Mendel. Brun. 63, 1 (2015), 207--216.Google ScholarCross Ref
Marc Franco-Salvador, Parth Gupta, and Paolo Rosso. 2013. Cross-language plagiarism detection using a multilingual semantic network. In Advances in Information Retrieval. 710--713.Google Scholar
Marc Franco-Salvador, Parth Gupta, and Paolo Rosso. 2014. Knowledge graphs as context models: Improving the detection of cross-language plagiarism with paraphrasing. In Bridging Between Information Retrieval and Databases: PROMISE Winter School 2013, Nicola Ferro (ed.). Springer-Verlag, Berlin, 227--236. DOI:10.1007/978-3-642-54798-0_12Google Scholar
Marc Franco-Salvador, Parth Gupta, Paolo Rosso, and Rafael E. Banchs. 2016. Cross-language plagiarism detection over continuous-space- and knowledge graph-based representations of language. Knowl.-Based Syst. 111 (2016), 87--99. DOI:10.1016/j.knosys.2016.08.004Google ScholarDigital Library
Marc Franco-Salvador, Paolo Rosso, and Manuel Montes-y-Gómez. 2016. A systematic study of knowledge graph analysis for cross-language plagiarism detection. Inf. Process. Manag. 52, 4 (2016), 550--570. DOI:10.1016/j.ipm.2015.12.004Google ScholarDigital Library
Marc Franco-Salvador, Paolo Rosso, and Roberto Navigli. 2014. A knowledge-based representation for cross-language document retrieval and categorization. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics. 414--423.Google ScholarCross Ref
Jordan Fréry, Christine Largeron, and Mihaela Juganaru-Mathieu. 2014. UJM at CLEF in Author Identification—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).Google Scholar
Evgeniy Gabrilovich and Shaul Markovitch. 2007. Computing semantic relatedness using wikipedia-based explicit semantic analysis. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI’07). 1606--1611.Google Scholar
Jean-Gabriel Ganascia, Peirre Glaudes, and Andrea Del Lungo. 2014. Automatic detection of reuses and citations in literary texts. Lit. Linguist. Comput. 29, 3 (2014), 412--421. DOI:10.1093/llc/fqu020Google ScholarCross Ref
Yasmany García-Mondeja, Daniel Castro-Castro, Vania Lavielle-Castro, and Rafael Muñoz. 2017. Discovering author groups using a b-compact graph-based clustering—notebook for PAN at CLEF 2017. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’17).Google Scholar
Urvashi Garg and Vishal Goyal. 2016. Maulik: A plagiarism detection tool for hindi documents. Ind. J. Sci. Technol. 9, 12 (2016).Google Scholar
Shahabeddin Geravand and Mahmood Ahmadi. 2014. An efficient and scalable plagiarism checking system using bloom filters. Comput. Electr. Eng. 40, 6 (2014), 1789--1800.Google ScholarDigital Library
M. R. Ghaeini. 2013. Intrinsic author identification using modified weighted KNN—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).Google Scholar
Erfaneh Gharavi, Kayvan Bijari, Kiarash Zahirnia, and Hadi Veisi. 2016. A deep learning approach to persian plagiarism detection. In Proceedings of the Forum for Information Retrieval Evaluation (FIRE’16). 154--159.Google Scholar
Lee Gillam. 2013. Guess again and see if they line up: Surrey's runs at plagiarism detection—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).Google Scholar
Bela Gipp. 2014. Citation-based Plagiarism Detection -Detecting Disguised and Cross-language Plagiarism Using Citation Pattern Analysis. Springer Vieweg Research. Retrieved from http://www.springer.com/978-3-658-06393-1.Google Scholar
Bela Gipp and Norman Meuschke. 2011. Citation pattern matching algorithms for citation-based plagiarism detection: Greedy citation tiling, citation chunking and longest common citation sequence. In Proceedings of the 11th ACM Symposium on Document Engineering. 249--258. DOI:10.1145/2034691.2034741Google ScholarDigital Library
Bela Gipp, Norman Meuschke, and Joeran Beel. 2011. Comparative evaluation of text- and citation-based plagiarism detection approaches using guttenplag. In Proceedings of 11th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL’11). 255--258. DOI:10.1145/1998076.1998124Google ScholarDigital Library
Bela Gipp, Norman Meuschke, and Corinna Breitinger. 2014. Citation‐based plagiarism detection: Practicability on a large‐scale scientific corpus. J. Assoc. Inf. Sci. Technol. 65, 8 (2014), 1527--1540. DOI:10.1002/asi.23228Google ScholarDigital Library
Bela Gipp, Norman Meuschke, Corinna Breitinger, Jim Pitman, and Andreas Nürnberger. 2014. Web-based demonstration of semantic similarity detection using citation pattern visualization for a cross language plagiarism case. In Proceedings of the International Conference on Enterprise Information Systems (ICEIS’14). 677--683. DOI:10.5220/0004985406770683Google ScholarDigital Library
Goran Glavaš, Marc Franco-Salvador, Simone P. Ponzetto, and Paolo Rosso. 2018. A resource-light method for cross-lingual semantic textual similarity. Knowl.-Based Syst. 143 (2018), 1--9. DOI:10.1016/j.knosys.2017.11.041Google ScholarCross Ref
Lila Gleitman and Anna Papafragou. 2005. Language and thought. In The Cambridge Handbook of Thinking and Reasoning, Keith J. Holyoak and Robert G. Morrison (eds.). Cambridge University Press, 633--661.Google Scholar
Demetrios G. Glinos. 2014. A hybrid architecture for plagiarism detection—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14). 958--965.Google Scholar
Helena Gómez-Adorno, Yuridiana Alemán, Darnes Vilariño Ayala, Miguel A Sanchez-Perez, David Pinto, and Grigori Sidorov. 2017. Author clustering using hierarchical clustering analysis—Notebook for PAN at CLEF 2017. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’17).Google Scholar
Helena Gómez-Adorno, Grigori Sidorov, David Pinto, and Ilia Markov. 2015. A graph based authorship identification approach—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).Google Scholar
Philipp Gross and Pashutan Modaresi. 2014. Plagiarism alignment detection by merging context seeds—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).Google Scholar
Deepa Gupta, Vani Kanjirangat, and L. M. Leema. 2016. Plagiarism detection in text documents using sentence bounded stop word n-grams. J. Eng. Sci. Technol. 11, 10 (2016), 1403--1420.Google Scholar
Deepa Gupta, Vani Kanjirangat, and Charan Kamal Singh. 2014. Using natural language processing techniques and fuzzy-semantic similarity for automatic external plagiarism detection. In Proceedings of the 2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI’14). 2694--2699. DOI:10.1109/ICACCI.2014.6968314Google ScholarCross Ref
Josue Gutierrez, Jose Casillas, Paola Ledesma, Gibran Fuentes, and Ivan Meza. 2015. Homotopy based classification for author verification task—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).Google Scholar
Yaakov HaCohen-Kerner and Aharon Tayeb. 2017. Rapid detection of similar peer-reviewed scientific papers via constant number of randomized fingerprints. Inf. Process. Manag. 53, 1 (2017), 70--86. DOI:10.1016/j.ipm.2016.06.007Google ScholarDigital Library
Matthias Hagen, Martin Potthast, and Benno Stein. 2015. Source retrieval for plagiarism detection from large web corpora: Recent approaches. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).Google Scholar
Osama Haggag and Samhaa Smhaa El-Beltagy. 2013. Plagiarism candidate retrieval using selective query formulation and discriminative query scoring. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).Google Scholar
Oren Halvani and Lukas Graner. 2017. Author clustering based on compression-based dissimilarity scores—notebook for PAN at CLEF 2017. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’17).Google Scholar
Oren Halvani and Martin Steinebach. 2014. VEBAV - A simple, scalable and fast authorship verification scheme—notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).Google Scholar
Oren Halvani, Martin Steinebach, and Ralf Zimmermann. 2013. Authorship verification via k-nearest neighbor estimation—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).Google Scholar
Oren Halvani and Christian Winter. 2015. A generic authorship verification scheme based on equal error rates—notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).Google Scholar
Christian Hänig, Robert Remus, and Xose De La Puente. 2015. Exb themis: Extensive feature extraction from word alignments for semantic textual similarity. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval’15). 264--268.Google ScholarCross Ref
Sarah Harvey. 2014. Author verification using PPM with parts of speech tagging—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
Hua He, John Wieting, Kevin Gimpel, Jinfeng Rao, and Jimmy Lin. 2016. UMD-TTIC-UW at SemEval-2016 Task 1: Attention-based multi-perspective convolutional neural networks for textual similarity measurement. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval’16). 1103--1108.
Oumaima Hourrane and El Habib Benlahmar. 2017. Survey of plagiarism detection approaches and big data techniques related to plagiarism candidate retrieval. In Proceedings of the 2nd International Conference on Big Data, Cloud and Applications (BDCA’17). 15:1--15:6. DOI:10.1145/3090354.3090369
Manuela Hürlimann, Benno Weck, Esther van den Berg, Simon Šuster, and Malvina Nissim. 2015. GLAD: Groningen lightweight authorship detection—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).
Syed Fawad Hussain and Asif Suryani. 2015. On retrieving intelligently plagiarized documents using semantic similarity. Eng. Appl. Artif. Intell. 45 (2015), 246--258. DOI:10.1016/j.engappai.2015.07.011
Ashraf S. Hussein. 2015. A plagiarism detection system for Arabic documents. In Intelligent Systems 2014, D. Filev, J. Jabłkowski, J. Kacprzyk, M. Krawczak, I. Popchev, L. Rutkowski, V. Sgurev, E. Sotirova, P. Szynkarczyk, and S. Zadrozny (Eds.). Springer International Publishing, 541--552.
Ashraf S. Hussein. 2015. Arabic document similarity analysis using n-grams and singular value decomposition. In Proceedings of the 2015 IEEE 9th International Conference on Research Challenges in Information Science (RCIS’15). 445--455. DOI:10.1109/RCIS.2015.7128906
Radu Tudor Ionescu, Marius Popescu, and Aoife Cahill. 2014. Can characters reveal your native language? A language-independent approach to native language identification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP’14). 1363--1373.
Hideo Itoh. 2016. RICOH at SemEval-2016 Task 1: IR-based semantic textual similarity estimation. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval’16). 691--695.
Magdalena Jankowska, Vlado Kešelj, and Evangelos Milios. 2013. Proximity based one-class classification with common n-gram dissimilarity for authorship verification task—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Magdalena Jankowska, Vlado Kešelj, and Evangelos Milios. 2014. Ensembles of proximity-based one-class classifiers for author verification—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
Arun Jayapal and Binayak Goswami. 2013. Vector space model and overlap metric for author identification—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Zhuoren Jiang, Miao Chen, and Xiaozhong Liu. 2014. Semantic annotation with rescoredESA: Rescoring concept features generated from explicit semantic analysis. In Proceedings of the 7th International Workshop on Exploiting Semantic Annotations in Information Retrieval (ESAIR’14). 25--27. DOI:10.1145/2663712.2666192
M. A. C. Jiffriya, M. A. C. Akmal Jahan, and Roshan G. Ragel. 2014. Plagiarism detection on electronic text based assignments using vector space model. In Proceedings of the 7th International Conference on Information and Automation for Sustainability. 1--5. DOI:10.1109/ICIAFS.2014.7069593
M. A. C. Jiffriya, M. A. C. Akmal Jahan, Roshan G. Ragel, and Sampath Deegalla. 2013. AntiPlag: Plagiarism detection on electronic submissions of text based assignments. In Proceedings of the 2013 IEEE 8th International Conference on Industrial and Information Systems. 376--380. DOI:10.1109/ICIInfS.2013.6732013
Patrick Juola. 2017. Detecting contract cheating via stylometric methods. In Proceedings on the Conference on Plagiarism across Europe and Beyond. 187--198. Retrieved from https://plagiarism.pefka.mendelu.cz/files/proceedings17.pdf.
Patrick Juola and Efstathios Stamatatos. 2013. Overview of the author identification task at PAN 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Rune Borge Kalleberg. 2015. Towards detecting textual plagiarism using machine learning methods. University of Agder. Retrieved from https://brage.bibsys.no/xmlui/bitstream/handle/11250/299460/Rune Borge Kalleberg.pdf?sequence=1.
Rafael-Michael Karampatsis. 2015. CDTDS: Predicting paraphrases in Twitter via support vector regression. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval’15). 75--79.
Daniel Karaś, Martyna Śpiewak, and Piotr Sobecki. 2017. OPI-JSA at CLEF 2017: Author clustering and style breach detection—Notebook for PAN at CLEF 2017. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’17).
Roman Kern. 2013. Grammar checker features for author identification and author profiling—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Imtiaz H. Khan, Muazzam A. Siddiqui, Kamal M. Jambi, Muhammad Imran, and Abobakr A. Bagais. 2014. Query optimization in Arabic plagiarism detection: An empirical study. Int. J. Intell. Syst. Appl. 7, 1 (2014), 73.
Jamal Ahmad Khan. 2017. Style breach detection: An unsupervised detection model—Notebook for PAN at CLEF 2017. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’17).
Mahmoud Khonji and Youssef Iraqi. 2014. A Slightly-modified GI-based Author-verifier with Lots of Features (ASGALF)—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
Khadijeh Khoshnavataher, Vahid Zarrabi, Salar Mohtaj, and Habibollah Asghari. 2015. Developing monolingual Persian corpus for extrinsic plagiarism detection using artificial obfuscation—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).
Barbara Kitchenham. 2004. Procedures for performing systematic reviews. Keele University Technical Report TR/SE-0401. Keele University. 33.
Barbara Kitchenham, O. Pearl Brereton, David Budgen, Mark Turner, John Bailey, and Stephen Linkman. 2009. Systematic literature reviews in software engineering—A systematic literature review. Inf. Softw. Technol. 51, 1 (2009), 7--15. DOI:10.1016/j.infsof.2008.09.009
Mirco Kocher. 2016. UniNE at CLEF 2016: Author clustering—Notebook for PAN at CLEF 2016. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’16).
Mirco Kocher and Jacques Savoy. 2015. UniNE at CLEF 2015: Author identification—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).
Mirco Kocher and Jacques Savoy. 2017. UniNE at CLEF 2017: Author clustering—Notebook for PAN at CLEF 2017. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’17).
Leilei Kong, Yong Han, Zhongyuan Han, Haihao Yu, Qibo Wang, Tinglei Zhang, and Haoliang Qi. 2014. Source retrieval based on learning to rank and text alignment based on plagiarism type recognition for plagiarism detection—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
Leilei Kong, Zhimao Lu, Yong Han, Haoliang Qi, Zhongyuan Han, Qibo Wang, Zhenyuan Hao, and Jing Zhang. 2015. Source retrieval and text alignment corpus construction for plagiarism detection—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).
Leilei Kong, Zhimao Lu, Haoliang Qi, and Zhongyuan Han. 2014. Detecting high obfuscation plagiarism: Exploring multi-features fusion via machine learning. Int. J. u-and e-Serv. Sci. Technol. 7, 4 (2014), 385--396.
Leilei Kong, Haoliang Qi, Cuixia Du, Mingxing Wang, and Zhongyuan Han. 2013. Approaches for source retrieval and text alignment of plagiarism detection—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Moshe Koppel and Yaron Winter. 2014. Determining if two documents are written by the same author. J. Assoc. Inf. Sci. Technol. 65, 1 (2014), 178--187.
Niraj Kumar. 2014. A graph based automatic plagiarism detection technique to handle artificial word reordering and paraphrasing. In Computational Linguistics and Intelligent Text Processing. 481--494.
Marcin Kuta and Jacek Kitowski. 2014. Optimisation of character n-gram profiles method for intrinsic plagiarism detection. In Artificial Intelligence and Soft Computing. 500--511.
Mikhail Kuznetsov, Anastasia Motrenko, Rita Kuznetsova, and Vadim Strijov. 2016. Methods for intrinsic plagiarism detection and author diarization. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’16). 912--919. Retrieved from http://ceur-ws.org/Vol-1609/.
Robert Layton, Paul Watters, and Richard Dazeley. 2013. Local n-grams for author identification—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Paola Ledesma, Gibran Fuentes, Gabriela Jasso, Angel Toledo, and Ivan Meza. 2013. Distance learning for author verification—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Taemin Lee, Jeongmin Chae, Kinam Park, and Soonyoung Jung. 2013. CopyCaptor: Plagiarized source retrieval system using global word frequency and local feedback—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Chi-kiu Lo, Cyril Goutte, and Michel Simard. 2016. CNRC at SemEval-2016 Task 1: Experiments in crosslingual semantic textual similarity. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval’16). 668--673.
Tara C. Long, Mounir Errami, Angela C. George, Zhaohui Sun, and Harold R. Garner. 2009. Responding to possible plagiarism. Science 323, 5919 (2009), 1293--1294. DOI:10.1126/science.1167408
Ahmed Magooda, Ashraf Y. Mahgoub, Mohsen Rashwan, Magda B. Fayek, and Hazem Raafat. 2015. RDI System for extrinsic plagiarism detection (RDI_RED)—Working Notes for PAN-AraPlagDet at FIRE 2015. In Proceedings of the Forum for Information Retrieval Evaluation (FIRE’15).
Peyman Mahdavi, Zahra Siadati, and Farzin Yaghmaee. 2014. Automatic external Persian plagiarism detection using vector space model. In Proceedings of the 2014 4th International eConference on Computer and Knowledge Engineering (ICCKE’14). 697--702.
Ashraf Y. Mahgoub, Ahmed Magooda, Mohsen Rashwan, Magda B. Fayek, and Hazem Raafat. 2015. RDI System for intrinsic plagiarism detection (RDI_RID)—Working Notes for PAN-AraPlagDet at FIRE 2015. In Proceedings of the Forum for Information Retrieval Evaluation (FIRE’15).
Promita Maitra, Souvick Ghosh, and Dipankar Das. 2015. Authorship verification - an approach based on random forest—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).
Cristhian Mayor, Josue Gutierrez, Angel Toledo, Rodrigo Martinez, Paola Ledesma, Gibran Fuentes, and Ivan Meza. 2014. A single author style representation for the author verification task—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
Norman Meuschke and Bela Gipp. 2013. State-of-the-art in detecting academic plagiarism. Int. J. Educ. Integr. 9, 1 (2013), 50--71.
Norman Meuschke and Bela Gipp. 2014. Reducing computational effort for plagiarism detection by using citation characteristics to limit retrieval space. In Proceedings of the 14th ACM/IEEE-CS Joint Conference on Digital Libraries. 197--200.
Norman Meuschke, Christopher Gondek, Daniel Seebacher, Corinna Breitinger, Daniel A. Keim, and Bela Gipp. 2018. An adaptive image-based plagiarism detection approach. In Proceedings of the ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL’18). DOI:10.1145/3197026.3197042
Norman Meuschke, Moritz Schubotz, Felix Hamborg, Tomáš Skopal, and Bela Gipp. 2017. Analyzing mathematical content to detect academic plagiarism. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (CIKM’17). 2211--2214. DOI:10.1145/3132847.3133144
Norman Meuschke, Nicolas Siebeck, Moritz Schubotz, and Bela Gipp. 2017. Analyzing semantic concept patterns to detect academic plagiarism. In Proceedings of the 6th International Workshop on Mining Scientific Publications (WOSP’17). 46--53. DOI:10.1145/3127526.3127535
Norman Meuschke, Vincent Stange, Moritz Schubotz, Michael Kramer, and Bela Gipp. 2019. Improving academic plagiarism detection for STEM documents by analyzing mathematical content and citations. In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL’19).
Pashutan Modaresi and Philipp Gross. 2014. A language independent author verifier using fuzzy c-means clustering—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
H. F. Moed, W. J. M. Burger, J. G. Frankfort, and A. F. J. Van Raan. 1985. The application of bibliometric indicators: Important field- and time-dependent factors to be considered. Scientometrics 8, 3--4 (1985), 177--203. DOI:10.1007/BF02016935
Majid Mohebbi and Alireza Talebpour. 2016. Texts semantic similarity detection based graph approach. Int. Arab J. Inf. Technol. 13, 2 (2016), 246--251.
Mozhgan Momtaz, Kayvan Bijari, Mostafa Salehi, and Hadi Veisi. 2016. Graph-based approach to text alignment for plagiarism detection in Persian documents. In Proceedings of the Forum for Information Retrieval Evaluation (FIRE’16). 176--179.
Erwan Moreau, Arun Jayapal, and Carl Vogel. 2014. Author verification: Exploring a large set of parameters using a genetic algorithm—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
Erwan Moreau, Arun Jayapal, Gerard Lynch, and Carl Vogel. 2015. Author verification: Basic stacked generalization applied to predictions from a set of heterogeneous learners—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).
Erwan Moreau and Carl Vogel. 2013. Style-based distance features for author verification—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Maxim Mozgovoy, Tuomo Kakkonen, and Georgina Cosma. 2010. Automatic student plagiarism detection: Future perspectives. J. Educ. Comput. Res. 43, 4 (2010), 511--531.
Aibek Musaev, De Wang, Saajan Shridhar, and Calton Pu. 2015. Fast text classification using randomized explicit semantic analysis. In Proceedings of the 2015 IEEE International Conference on Information Reuse and Integration. 364--371. DOI:10.1109/IRI.2015.62
El Moatez Billah Nagoudi, Ahmed Khorsi, Hadda Cherroun, and Didier Schwab. 2018. 2L-APD: A Two-Level Plagiarism Detection System for Arabic Documents. Cybern. Inf. Technol. 18, 1 (2018), 124--138. DOI:10.2478/cait-2018-0011
Rao Muhammad Adeel Nawab, Mark Stevenson, and Paul Clough. 2017. An IR-based approach utilizing query expansion for plagiarism detection in MEDLINE. IEEE/ACM Trans. Comput. Biol. Bioinforma. 14, 4 (2017), 796--804. DOI:10.1109/TCBB.2016.2542803
Philip M. Newton. 2018. How common is commercial contract cheating in higher education and is it increasing? A Systematic Review. Front. Educ. 3 (2018). DOI:10.3389/feduc.2018.00067
Le Thanh Nguyen, Nguyen Xuan Toan, and Dinh Dien. 2016. Vietnamese plagiarism detection method. In Proceedings of the 7th Symposium on Information and Communication Technology (SoICT’16). 44--51. DOI:10.1145/3011077.3011109
Gabriel Oberreuter and Juan D. Velásquez. 2013. Text mining applied to plagiarism detection: The use of words for detecting deviations in the writing style. Exp. Syst. Appl. 40, 9 (2013), 3756--3763.
Milan Ojsteršek, Janez Brezovnik, Mojca Kotar, Marko Ferme, Goran Hrovat, Albin Bregant, and Mladen Borovič. 2014. Establishing of a Slovenian open access infrastructure: A technical point of view. Program 48, 4 (2014), 394--412. DOI:10.1108/PROG-02-2014-0005
Adeva Oktoveri, Agung Toto Wibowo, and Ari Moesriami Barmawi. 2014. Non-relevant document reduction in anti-plagiarism using asymmetric similarity and AVL tree index. In Proceedings of the 2014 5th International Conference on Intelligent and Advanced Systems (ICIAS’14). 1--5. DOI:10.1109/ICIAS.2014.6869547
Ahmed Hamza Osman and Naomie Salim. 2013. An improved semantic plagiarism detection scheme based on Chi-squared automatic interaction detection. In Proceedings of the 2013 International Conference on Computing, Electrical and Electronic Engineering (ICCEEE’13). 640--647. DOI:10.1109/ICCEEE.2013.6634015
Caleb Owens and Fiona A. White. 2013. A 5‐year systematic strategy to reduce plagiarism among first‐year psychology university students. Aust. J. Psychol. 65, 1 (2013), 14--21. DOI:10.1111/ajpy.12005
María Leonor Pacheco, Kelwin Fernandes, and Aldo Porco. 2015. Random forest with increased generalization: A universal background approach for authorship verification—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).
Yurii Palkovskii and Alexei Belov. 2013. Using hybrid similarity methods for plagiarism detection—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Yurii Palkovskii and Alexei Belov. 2014. Developing high-resolution universal multi-type n-gram plagiarism detector. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14). 984--989.
Guy Paré, Marie-Claude Trudel, Mirou Jaana, and Spyros Kitsiou. 2015. Synthesizing information systems knowledge: A typology of literature reviews. Inf. Manag. 52, 2 (2015), 183--199. DOI:10.1016/j.im.2014.08.008
Merin Paul and Sangeetha Jamal. 2015. An Improved SRL based plagiarism detection technique using sentence ranking. Procedia Comput. Sci. 46 (2015), 223--230. DOI:10.1016/j.procs.2015.02.015
Ellie Pavlick, Pushpendre Rastogi, Juri Ganitkevitch, Benjamin Van Durme, and Chris Callison-Burch. 2015. PPDB 2.0: Better paraphrase ranking, fine-grained entailment relations, word embeddings, and style classification. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. 425--430.
Jian Peng, Kim-Kwang Raymond Choo, and Helen Ashman. 2016. Bit-level n-gram based forensic authorship analysis on social media: Identifying individuals from linguistic profiles. J. Netw. Comput. Appl. 70 (2016), 171--182. DOI:10.1016/j.jnca.2016.04.001
Solange de L. Pertile, Viviane P. Moreira, and Paolo Rosso. 2015. Comparing and combining content‐ and citation‐based approaches for plagiarism detection. J. Assoc. Inf. Sci. Technol. 67, 10 (2015), 2511--2526. DOI:10.1002/asi.23593
Solange de L. Pertile, Paolo Rosso, and Viviane P. Moreira. 2013. Counting co-occurrences in citations to identify plagiarised text fragments. In Proceedings of the International Conference of the Cross-Language Evaluation Forum for European Languages. 150--154.
Timo Petmanson. 2013. Authorship identification using correlations of frequent features—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Mohammad Taher Pilehvar, David Jurgens, and Roberto Navigli. 2013. Align, disambiguate and walk: A unified approach for measuring semantic similarity. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 1341--1351.
Gaspar Pizarro V. and Juan D. Velásquez. 2017. Docode 5: Building a real-world plagiarism detection system. Eng. Appl. Artif. Intell. 64 (Jun. 2017), 261--271. DOI:10.1016/j.engappai.2017.06.001
Juan-Pablo Posadas-Durán, Grigori Sidorov, Ildar Batyrshin, and Elibeth Mirasol-Meléndez. 2015. Author verification using syntactic n-grams—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).
Martin Potthast, Tim Gollub, Matthias Hagen, Martin Tippmann, Johannes Kiesel, Paolo Rosso, Efstathios Stamatatos, and Benno Stein. 2013. Overview of the 5th International Competition on Plagiarism Detection. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Martin Potthast, Matthias Hagen, Anna Beyer, Matthias Busse, Martin Tippmann, Paolo Rosso, and Benno Stein. 2014. Overview of the 6th International Competition on Plagiarism Detection. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
Martin Potthast, Matthias Hagen, and Benno Stein. 2016. Author Obfuscation: Attacking the state of the art in authorship verification. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’16).
Martin Potthast, Francisco Rangel, Michael Tschuggnall, Efstathios Stamatatos, Paolo Rosso, and Benno Stein. 2017. Overview of PAN’17: Author identification, author profiling, and author obfuscation. In Proceedings of the 7th International Conference of the CLEF Initiative. DOI:10.1007/978-3-319-65813-1_25
Martin Potthast, Benno Stein, Alberto Barrón-Cedeño, and Paolo Rosso. 2010. An Evaluation framework for plagiarism detection. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters (COLING’10). 997--1005.
Martin Potthast, Benno Stein, Andreas Eiselt, Alberto Barrón-Cedeño, and Paolo Rosso. 2009. Overview of the 1st international competition on plagiarism detection. In Proceedings of the SEPLN 09 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN’09). 1--9.
Amit Prakash and Sujan Kumar Saha. 2014. Experiments on document chunking and query formation for plagiarism source retrieval—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
Piotr Przybyła, Nhung T. H. Nguyen, Matthew Shardlow, Georgios Kontonatsios, and Sophia Ananiadou. 2016. NaCTeM at SemEval-2016 Task 1: Inferring sentence-level semantic similarity from an ensemble of complementary lexical and sentence-level features. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval’16). 614--620.
Javad Rafiei, Salar Mohtaj, Vahid Zarrabi, and Habibollah Asghari. 2015. Source retrieval plagiarism detection based on noun phrase and keyword phrase extraction—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).
Shima Rakian, Esfahani Faramarz Safi, and Hamid Rastegari. 2015. A Persian fuzzy plagiarism detection approach. J. Inf. Syst. Telecommun. 3, 3 (2015), 182--190.
N. Riya Ravi and Deepa Gupta. 2015. Efficient paragraph based chunking and download filtering for plagiarism source retrieval—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).
N. Riya Ravi, Vani Kanjirangat, and Deepa Gupta. 2016. Exploration of fuzzy C means clustering algorithm in external plagiarism detection system. In Intelligent Systems Technologies and Applications. Springer, 127--138.
Andi Rexha, Stefan Klampfl, Mark Kröll, and Roman Kern. 2015. Towards authorship attribution for bibliometrics using stylometric features. In Proceedings of the Conference on Computational Linguistics and Bibliometrics co-located with the International Conference on Scientometrics and Informetrics (CLBib@ISSI). 44--49.
Diego Antonio Rodríguez Torrejón and José Manuel Martín Ramos. 2014. CoReMo 2.3 Plagiarism detector text alignment module—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
Paolo Rosso, Francisco Rangel, Martin Potthast, Efstathios Stamatatos, Michael Tschuggnall, and Benno Stein. 2016. Overview of PAN’16. In Experimental IR Meets Multilinguality, Multimodality, and Interaction. 332--350.
Frantz Rowe. 2014. What literature review is not: Diversity, boundaries and recommendations. Eur. J. Inf. Syst. 23, 3 (2014), 241--255. DOI:10.1057/ejis.2014.7
Barbara Rychalska, Katarzyna Pakulska, Krystyna Chodorowska, Wojciech Walczak, and Piotr Andruszkiewicz. 2016. Samsung Poland NLP Team at SemEval-2016 Task 1: Necessity for diversity; combining recursive autoencoders, WordNet and ensemble methods to measure semantic similarity. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval’16). 602--608.
Kamil Safin and Rita Kuznetsova. 2017. Style breach detection with neural sentence embeddings—Notebook for PAN at CLEF 2017. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’17).
Anuj Saini and Aayushi Verma. 2016. Anuj@DPIL-FIRE2016: A novel paraphrase detection method in Hindi language using machine learning. In Proceedings of the Forum for Information Retrieval Evaluation. 141--152.
Miguel A. Sanchez-Perez, Alexander Gelbukh, and Grigori Sidorov. 2015. Dynamically adjustable approach through obfuscation type recognition—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).
Miguel A. Sanchez-Perez, Grigori Sidorov, and Alexander F. Gelbukh. 2014. A winning approach to text alignment for text reuse detection at PAN 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14). 1004--1011.
Fernando Sánchez-Vega, Esaú Villatoro-Tello, Manuel Montes-y-Gómez, Luis Villaseñor-Pineda, and Paolo Rosso. 2013. Determining and characterizing the reused text for plagiarism detection. J. Assoc. Inf. Sci. Technol. 65, 5 (2013), 1804--1813. DOI:10.1016/j.eswa.2012.09.021
Yunita Sari and Mark Stevenson. 2015. A machine learning-based intrinsic method for cross-topic and cross-genre authorship verification—Notebook for PAN at CLEF 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).
Yunita Sari and Mark Stevenson. 2016. Exploring word embeddings and character n-grams for author clustering—Notebook for PAN at CLEF 2016. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’16).
Satyam, Anand, Arnav Kumar Dawn, and Sujan Kumar Saha. 2014. Statistical analysis approach to author identification using latent semantic analysis—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
Taneeya Satyapanich, Hang Gao, and Tim Finin. 2015. Ebiquity: Paraphrase and semantic similarity in Twitter using skipgrams. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval’15). 51--55.
Andreas Schmidt, Reinhold Becker, Daniel Kimmig, Robert Senger, and Steffen Scholz. 2014. A concept for plagiarism detection based on compressed bitmaps. In Proceedings of the 6th International Conference on Advances in Databases, Knowledge, and Data Applications. 30--34.
Shachar Seidman. 2013. Authorship verification using the impostors method—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Prasha Shrestha, Suraj Maharjan, and Thamar Solorio. 2014. Machine translation evaluation metric for text alignment—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
Prasha Shrestha and Thamar Solorio. 2013. Using a variety of n-grams for the detection of different kinds of plagiarism. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Muazzam Ahmed Siddiqui, Imtiaz Hussain Khan, Kamal Mansoor Jambi, Salma Omar Elhaj, and Abobakr Bagais. 2014. Developing an Arabic plagiarism detection corpus. Comput. Sci. Inf. Technol. 4, 2014 (2014), 261--269. DOI:10.5121/csit.2014.41221
L. Sindhu and Sumam Mary Idicula. 2015. Fingerprinting based detection system for identifying plagiarism in Malayalam text documents. In Proceedings of the 2015 International Conference on Computing and Network Communications (CoCoNet’15). 553--558. DOI:10.1109/CoCoNet.2015.7411242
Abdul Sittar, Hafiz Rizwan Iqbal, and Rao Muhammad Adeel Nawab. 2016. Author diarization using cluster-distance approach. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’16). 1000--1007.
Sidik Soleman and Ayu Purwarianti. 2014. Experiments on the Indonesian plagiarism detection using latent semantic analysis. In Proceedings of the 2014 2nd International Conference on Information and Communication Technology (ICoICT’14). 413--418. DOI:10.1109/ICoICT.2014.6914098
Hussein Soori, Michal Prilepok, Jan Platos, Eshetie Berhan, and Vaclav Snasel. 2014. Text similarity based on data compression in Arabic. In AETA 2013: Recent Advances in Electrical Engineering and Related Sciences. Springer, 211--220.
Efstathios Stamatatos, Walter Daelemans, Ben Verhoeven, Martin Potthast, Benno Stein, Patrick Juola, Miguel A. Sanchez-Perez, and Alberto Barrón-Cedeño. 2014. Overview of the author identification task at PAN 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
Efstathios Stamatatos, Martin Potthast, Francisco Rangel, Paolo Rosso, and Benno Stein. 2015. Overview of the PAN/CLEF 2015 Evaluation Lab. In Experimental IR Meets Multilinguality, Multimodality, and Interaction: Proceedings of the 6th International Conference of the CLEF Initiative (CLEF’15). 518--538. DOI:10.1007/978-3-319-24027-5_49
Efstathios Stamatatos, Walter Daelemans, Ben Verhoeven, Patrick Juola, Aurelio López-López, Martin Potthast, and Benno Stein. 2015. Overview of the author identification task at PAN 2015. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).
Benno Stein, Sven zu Eissen, and Martin Potthast. 2007. Strategies for retrieving plagiarized documents. In Proceedings of the 30th Annual International ACM SIGIR Conference. 825--826. DOI:10.1145/1277741.1277928
Imam Much Ibnu Subroto and Ali Selamat. 2014. Plagiarism detection through internet using hybrid artificial neural network and support vectors machine. Telecommun. Comput. Electron. Control. 12, 1 (2014), 209--218.
Šimon Suchomel and Michal Brandejs. 2014. Heterogeneous queries for synoptic and phrasal search—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
Šimon Suchomel and Michal Brandejs. 2015. Improving synoptic querying for source retrieval. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’15).
Šimon Suchomel, Jan Kasprzak, and Michal Brandejs. 2013. Diverse queries and feature type selection for plagiarism discovery—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
M. D. Arafat Sultan, Steven Bethard, and Tamara Sumner. 2014. DLS@CU: Sentence similarity from word alignment. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval’14). 241--246.
M. D. Arafat Sultan, Steven Bethard, and Tamara Sumner. 2014. Back to basics for monolingual alignment: Exploiting word similarity and contextual evidence. Trans. Assoc. Comput. Linguist. 2 (2014), 219--230.Google ScholarCross Ref
M. D. Arafat Sultan, Steven Bethard, and Tamara Sumner. 2015. DLS@CU: Sentence similarity from word alignment and semantic vector composition. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval’15). 148--153.Google ScholarCross Ref
Junfeng Tian and Man Lan. 2016. ECNU at SemEval-2016 Task 1: Leveraging word embedding from macro and micro views to boost performance for semantic textual similarity. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval’16). 621--627.Google ScholarCross Ref
Diego A. Rodríguez Torrejón and José Manuel Martín Ramos. 2013. Text alignment module in CoReMo 2.1 plagiarism detector. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).Google Scholar
Michael Tschuggnall and Günther Specht. 2013. Detecting plagiarism in text documents through grammar-analysis of authors. Datenbanksysteme für Business, Technologie und Web (BTW) 2028, Volker Markl, Gunter Saake, Kai-Uwe Sattler, Gregor Hackenbroich, Bernhard Mitschang, Theo Härder, and Veit Köppen (Eds.). Gesellschaft für Informatik e.V., 241--259.
Michael Tschuggnall and Günther Specht. 2013. Using grammar-profiles to intrinsically expose plagiarism in text documents. In Natural Language Processing and Information Systems. 297--302.
Michael Tschuggnall, Efstathios Stamatatos, Ben Verhoeven, Walter Daelemans, Günther Specht, Benno Stein, and Martin Potthast. 2017. Overview of the author identification task at PAN-2017: Style breach detection and author clustering. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’17).
Alper Kursat Uysal and Serkan Gunal. 2014. Text classification using genetic algorithm oriented latent semantic features. Exp. Syst. Appl. 41, 13 (2014), 5938--5947. DOI:10.1016/j.eswa.2014.03.041
Vani Kanjirangat and Deepa Gupta. 2014. Using K-means cluster based techniques in external plagiarism detection. In Proceedings of the 2014 International Conference on Contemporary Computing and Informatics (IC3I’14). 1268--1273. DOI:10.1109/IC3I.2014.7019659
Vani Kanjirangat and Deepa Gupta. 2015. Investigating the impact of combined similarity metrics and POS tagging in extrinsic text plagiarism detection system. In Proceedings of the 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI’15). 1578--1584. DOI:10.1109/ICACCI.2015.7275838
Vani Kanjirangat and Deepa Gupta. 2016. Study on extrinsic text plagiarism detection techniques and tools. J. Eng. Sci. Technol. Rev. 9, 5 (2016), 9--23.
Vani Kanjirangat and Deepa Gupta. 2017. Detection of idea plagiarism using syntax--semantic concept extractions with genetic algorithm. Exp. Syst. Appl. 73 (2017), 11--26. DOI:10.1016/j.eswa.2016.12.022
Vani Kanjirangat and Deepa Gupta. 2017. Identifying document-level text plagiarism: A two-phase approach. J. Eng. Sci. Technol. 12, 12 (2017), 3226--3250.
Vani Kanjirangat and Deepa Gupta. 2017. Text plagiarism classification using syntax based linguistic features. Exp. Syst. Appl. 88 (2017), 448--464. DOI:10.1016/j.eswa.2017.07.006
Anna Vartapetiance and Lee Gillam. 2013. A textual modus operandi: Surrey's simple system for author identification—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Juan D. Velásquez, Yerko Covacevich, Francisco Molina, Edison Marrese-Taylor, Cristián Rodríguez, and Felipe Bravo-Marquez. 2016. DOCODE 3.0 (DOcument COpy DEtector): A system for plagiarism detection by applying an information fusion process from multiple documental data sources. Inf. Fus. 27 (2016), 64--75. DOI:10.1016/j.inffus.2015.05.006
Ondřej Veselý, Tomáš Foltýnek, and Jiří Rybička. 2013. Source retrieval via naïve approach and passage selection heuristics—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Darnes Vilariño, David Pinto, Helena Gómez, Saúl León, and Esteban Castillo. 2013. Lexical-syntactic and graph-based features for authorship verification—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Ngoc Phuoc An Vo, Octavian Popescu, and Tommaso Caselli. 2014. FBK-TR: SVM for semantic relatedeness and corpus patterns for RTE. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval’14). 289--293.
Hai Hieu Vu, Jeanne Villaneau, Farida Saïd, and Pierre-François Marteau. 2014. Sentence similarity by combining explicit semantic analysis and overlapping N-grams. In Text, Speech and Dialogue. 201--208.
Elizabeth Wager. 2014. Defining and responding to plagiarism. Learn. Publ. 27, 1 (2014), 33--42. DOI:10.1087/20140105
Wafa Wali, Bilel Gargouri, and Abdelmajid Ben Hamadou. 2015. Supervised learning to measure the semantic similarity between Arabic sentences. In Computational Collective Intelligence. 158--167.
John Walker. 1998. Student plagiarism in universities: What are we doing about it? High. Educ. Res. Dev. 17, 1 (1998), 89--106. DOI:10.1080/0729436980170105
Shuai Wang, Haoliang Qi, Leilei Kong, and Cuixia Nu. 2013. Combination of VSM and Jaccard coefficient for external plagiarism detection. In Proceedings of the 2013 International Conference on Machine Learning and Cybernetics. 1880--1885. DOI:10.1109/ICMLC.2013.6890902
Debora Weber-Wulff. 2014. False feathers: A Perspective on Academic Plagiarism. Springer, Berlin.
Debora Weber-Wulff, Christopher Möller, Jannis Touras, and Elin Zincke. 2013. Plagiarism Detection Software Test 2013. Retrieved from http://plagiat.htw-berlin.de/wp-content/uploads/Testbericht-2013-color.pdf.
Agung Toto Wibowo, Kadek W. Sudarmadi, and Ari M. Barmawi. 2013. Comparison between fingerprint and winnowing algorithm to detect plagiarism fraud on Bahasa Indonesia documents. In Proceedings of the 2013 International Conference of Information and Communication Technology (ICoICT’13). 128--133. DOI:10.1109/ICoICT.2013.6574560
Kyle Williams, Hung-Hsuan Chen, Sagnik Ray Chowdhury, and C. Lee Giles. 2013. Unsupervised ranking for plagiarism source retrieval—Notebook for PAN at CLEF 2013. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’13).
Kyle Williams, Hung-Hsuan Chen, and C. Lee Giles. 2014. Classifying and ranking search engine results as potential sources of plagiarism. In Proceedings of the 2014 ACM Symposium on Document Engineering (DocEng’14). 97--106. DOI:10.1145/2644866.2644879
Kyle Williams, Hung-Hsuan Chen, and C. Lee Giles. 2014. Supervised ranking for plagiarism source retrieval—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
Xuchen Yao, Benjamin Van Durme, Chris Callison-Burch, and Peter Clark. 2013. A lightweight and high performance monolingual word aligner. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 702--707.
Takeru Yokoi. 2015. Sentence-based plagiarism detection for Japanese document based on common nouns and part-of-speech structure. In Intelligent Software Methodologies, Tools and Techniques. 297--308.
Guido Zarrella, John Henderson, Elizabeth M. Merkhofer, and Laura Strickhart. 2015. MITRE: Seven systems for semantic similarity in tweets. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval’15). 12--17.
Chunxia Zhang, Xindong Wu, Zhendong Niu, and Wei Ding. 2014. Authorship identification from unstructured texts. Knowl.-Based Syst. 66 (2014), 99--111. DOI:10.1016/j.knosys.2014.04.025
Jiang Zhao and Man Lan. 2015. ECNU: Leveraging word embeddings to boost performance for paraphrase in Twitter. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval’15). 34--39.
Valentin Zmiycharov, Dimitar Alexandrov, Hristo Georgiev, Yasen Kiprov, Georgi Georgiev, Ivan Koychev, and Preslav Nakov. 2016. Experiments in authorship-link ranking and complete author clustering—Notebook for PAN at CLEF 2016. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’16).
Sven Meyer Zu Eissen and Benno Stein. 2006. Intrinsic plagiarism detection. In Proceedings of the European Conference on Information Retrieval. 565--569.
Denis Zubarev and Ilya Sochenkov. 2014. Using sentence similarity measure for plagiarism source retrieval—Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum and Workshop (CLEF’14).
Teddi Fishman. 2009. 'We know it when we see it' is not good enough: Toward a standard definition of plagiarism that transcends theft, fraud, and copyright. In Proceedings of the 4th Asia Pacific Conference on Educational Integrity (4APCEI'09). 5.

Published in
ACM Computing Surveys Volume 52, Issue 6

November 2020

806 pages

ISSN:0360-0300

EISSN:1557-7341

DOI:10.1145/3368196

Editor:

Sartaj Sahni
Department of Computer and Information Science and Engineering

Copyright © 2019 Owner/Author

This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher

Association for Computing Machinery

New York, NY, United States
Publication History
- Published: 16 October 2019
- Revised: 1 August 2019
- Accepted: 1 August 2019
- Received: 1 March 2019
Author Tags
Plagiarism detection

literature review

machine learning

semantic analysis

text-matching software
Qualifiers
- survey
- Research
- Refereed