Abstract
Plagiarism is now pervasive in many spheres of life, including academia and research. The evolving strategies used by plagiarists make it difficult for existing approaches to detect plagiarism accurately. Plagiarism is checked using a variety of features, including syntactic, lexical, semantic, and structural ones. This study examines recent plagiarism detection tasks, particularly text-based, monolingual plagiarism detection. We propose a novel four-stage approach for detecting plagiarism. The framework uses natural language processing (NLP) rather than the more conventional string-matching methods. The system measures text similarity with a corpus-based approach that combines two metrics, skip-gram and the Dice coefficient. Deep and shallow NLP techniques are used to investigate the text's deeper meaning. Our findings indicate that deep NLP quickly identifies heavily revised text, while shallow NLP efficiently prepares text for further processing. Word2vec yields results comparable to those of straightforward deep NLP techniques, but it also flags documents that other methods might miss. Deep NLP also captures changes in synonyms and phrasing.
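To make the combination of skip-grams and the Dice coefficient concrete, the following is a minimal sketch, not the chapter's implementation. It assumes that "skip-gram" here means skip-bigrams (ordered token pairs that may have a few tokens between them) rather than the Word2vec skip-gram training objective, and it uses plain whitespace tokenization. The function names and the `max_skip` parameter are hypothetical and chosen only for illustration.

```python
def skip_bigrams(tokens, max_skip=2):
    """Return the set of ordered token pairs with at most `max_skip` tokens between them."""
    pairs = set()
    for i, left in enumerate(tokens):
        # The partner token may sit 1 .. max_skip + 1 positions to the right.
        for j in range(i + 1, min(i + 2 + max_skip, len(tokens))):
            pairs.add((left, tokens[j]))
    return pairs


def dice_coefficient(a, b):
    """Dice coefficient of two sets: 2 * |A & B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0  # assumption: two empty inputs are treated as dissimilar
    return 2 * len(a & b) / (len(a) + len(b))


def skip_gram_dice(text_a, text_b, max_skip=2):
    """Similarity of two texts as the Dice coefficient over their skip-bigram sets."""
    # Assumption: lowercasing and whitespace splitting stand in for real preprocessing.
    grams_a = skip_bigrams(text_a.lower().split(), max_skip)
    grams_b = skip_bigrams(text_b.lower().split(), max_skip)
    return dice_coefficient(grams_a, grams_b)


if __name__ == "__main__":
    original = "the student copied the essay from a website"
    revised = "the student copied an essay from some website"
    print(round(skip_gram_dice(original, revised), 3))
```

Because skip-bigrams tolerate inserted or substituted words between a pair, a lightly edited sentence still shares many pairs with its source, which is one reason such features are often preferred over exact string matching.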
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Praveen Kumar, K., Jaya Kumari, D., Uma Sankar, P. (2023). Utilizing Deep Natural Language Processing to Detect Plagiarism. In: Kumar, A., Ghinea, G., Merugu, S. (eds) Proceedings of the 2nd International Conference on Cognitive and Intelligent Computing. ICCIC 2022. Cognitive Science and Technology. Springer, Singapore. https://doi.org/10.1007/978-981-99-2742-5_29
DOI: https://doi.org/10.1007/978-981-99-2742-5_29
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-2741-8
Online ISBN: 978-981-99-2742-5
eBook Packages: Computer Science, Computer Science (R0)