An Improved Plagiarism Detection Method: Model and Sample

Fang, Jing; Zhang, Yuanyuan

doi:10.1007/978-1-4614-7010-6_106

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 236)

Abstract

Cosine similarity is an efficient measure for detecting plagiarism in documents. However, it can be misled if the documents are not properly preprocessed. Furthermore, the weight of each word in a document should depend on its occurrence frequency across the whole digital library; otherwise, the cosine similarity measure may not be accurate enough. This paper aims to improve the accuracy of the similarity measure. It proposes a preprocessing method and a model that adjusts each word's weight according to its occurrence frequency. The paper also develops a sample to illustrate how to preprocess documents, adjust the word weights, and calculate the similarity. The sample shows that applying the proposed model gives a better result.
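
The pipeline outlined in the abstract (preprocess, weight words by library-wide frequency, compute cosine similarity) can be sketched roughly as follows. This is a minimal illustration only: the chapter's actual preprocessing rules and weighting model are not given in the abstract, so the stop-word list, the smoothed inverse-document-frequency weight, and all function names here are assumptions standing in for them.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the chapter's real preprocessing is not shown here.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in", "on"}

def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stop words (assumed preprocessing)."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def idf_weights(library):
    """Smoothed inverse document frequency over the whole library.

    Standard IDF is used as a stand-in for the paper's weighting model.
    """
    n = len(library)
    df = Counter(w for doc in library for w in set(doc))
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in df.items()}

def weighted_vector(tokens, idf):
    """Sparse vector: term count scaled by the word's library-wide weight."""
    tf = Counter(tokens)
    return {w: count * idf.get(w, 1.0) for w, count in tf.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Toy "digital library" of three documents.
library = [preprocess(t) for t in (
    "The cat sat on the mat",
    "The cat sat on the sofa",
    "Stock prices fell sharply on Monday",
)]
idf = idf_weights(library)
vectors = [weighted_vector(doc, idf) for doc in library]

sim_close = cosine(vectors[0], vectors[1])  # documents sharing two words
sim_far = cosine(vectors[0], vectors[2])    # documents sharing no words
```

In this sketch, words that occur in many library documents receive a lower weight, so a shared common word contributes less to the similarity score than a shared rare word. Whether the chapter's model takes this exact form is not stated in the abstract.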

Author information

Authors and Affiliations

Modern Educational Technology Center, North China Institute of Science and Technology, Hebei, China
Jing Fang
Library, North China Institute of Science and Technology, Hebei, China
Yuanyuan Zhang

Corresponding author

Correspondence to Jing Fang.

Editor information

Editors and Affiliations

Department of Computer Science, University of Texas at Dallas, Richardson, TX, USA
W. Eric Wong
College of Computer and Software, Nanjing University of Information Science and Technology, Nanjing, Jiangsu, People’s Republic of China
Tinghuai Ma

About this paper

Cite this paper

Fang, J., Zhang, Y. (2013). An Improved Plagiarism Detection Method: Model and Sample. In: Wong, W.E., Ma, T. (eds) Emerging Technologies for Information Systems, Computing, and Management. Lecture Notes in Electrical Engineering, vol 236. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7010-6_106

DOI: https://doi.org/10.1007/978-1-4614-7010-6_106
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-7009-0
Online ISBN: 978-1-4614-7010-6
eBook Packages: Engineering, Engineering (R0)
