Comparing Images for Document Plagiarism Detection

Iwanowski, Marcin; Cacko, Arkadiusz; Sarwas, Grzegorz

doi:10.1007/978-3-319-46418-3_47

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9972)

Included in the following conference series:

International Conference on Computer Vision and Graphics

Abstract

The paper presents research on applying image processing methods to document comparison, with a view to their use in plagiarism-detection systems. Among image processing methods, feature-point methods are best suited for computing image similarity, thanks to their invariance to various image transforms. Various combinations of feature-point detectors and descriptors are investigated as potential tools for finding similar images in documents. The methods are tested on a database of scientific papers containing five well-known image processing test images. The paper also outlines how algorithms that compute image similarity may extend the functionality of plagiarism-detection systems.
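
The idea of scoring image similarity from feature-point descriptors can be illustrated with a minimal sketch. It is not the authors' method: it assumes binary descriptors stored as Python integers, Hamming-distance nearest-neighbour matching, a ratio test, and a score defined as the fraction of one-to-one matches. The function names and the `max_dist` and `ratio` thresholds are hypothetical choices made for illustration only.

```python
from typing import Dict, List, Tuple


def hamming(a: int, b: int) -> int:
    """Count the differing bits of two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match_descriptors(
    query: List[int],
    train: List[int],
    max_dist: int = 16,   # hypothetical cap on an acceptable distance
    ratio: float = 0.8,   # hypothetical ratio-test threshold
) -> List[Tuple[int, int]]:
    """Return one-to-one (query_index, train_index) matches."""
    best_for_train: Dict[int, Tuple[int, int]] = {}
    for qi, q in enumerate(query):
        # Distances to every candidate, smallest first.
        dists = sorted((hamming(q, t), ti) for ti, t in enumerate(train))
        if not dists:
            continue
        best_d, best_i = dists[0]
        if best_d > max_dist:
            continue  # nearest neighbour is still too far away
        if len(dists) > 1 and best_d >= ratio * dists[1][0]:
            continue  # ambiguous: second-best candidate is almost as close
        # Keep only the closest query for each train descriptor.
        prev = best_for_train.get(best_i)
        if prev is None or best_d < prev[0]:
            best_for_train[best_i] = (best_d, qi)
    return sorted((qi, ti) for ti, (_, qi) in best_for_train.items())


def similarity(query: List[int], train: List[int], **kwargs) -> float:
    """Fraction of descriptors matched, relative to the smaller set."""
    if not query or not train:
        return 0.0
    matches = match_descriptors(query, train, **kwargs)
    return len(matches) / min(len(query), len(train))
```

In a real pipeline the descriptor lists would come from a detector and descriptor pair applied to each image, and a geometric consistency check would normally follow the matching step; both are omitted here.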

Notes

1. The image counts in the last row of Table 1 do not sum to 162 (the total number of images) because some multiple images consist of several test images.
2. All but one of the detectors and descriptors were implemented using the corresponding procedures in the MATLAB Computer Vision System Toolbox. Only the SIFT code was taken from the external VLFeat MATLAB toolbox.

Acknowledgments

This work was partially supported by the European Union within the European Regional Development Fund. The authors would also like to thank Prof. Marek Kowalski for his support and fruitful discussions on plagiarism detection.

Author information

Authors and Affiliations

Institute of Control and Industrial Electronics, Warsaw University of Technology, ul. Koszykowa 75, 00-662, Warszawa, Poland
Marcin Iwanowski & Arkadiusz Cacko
Lingaro Sp. z o.o., ul. Puławska 99a, 02-595, Warszawa, Poland
Grzegorz Sarwas

Corresponding author

Correspondence to Marcin Iwanowski.

Editor information

Editors and Affiliations

Faculty of Applied Informatics and Math, Warsaw University of Life Sciences, Warsaw, Poland
Leszek J. Chmielewski
Computer Sci and Software Engineering, University of Western Australia, Perth, Australia
Amitava Datta
Faculty of Applied Informatics and Math, Warsaw University of Life Sciences SGGW, Warsaw, Poland
Ryszard Kozera
Institute of Computer Science, Silesian University of Technology, Gliwice, Poland
Konrad Wojciechowski

Cite this paper

Iwanowski, M., Cacko, A., Sarwas, G. (2016). Comparing Images for Document Plagiarism Detection. In: Chmielewski, L., Datta, A., Kozera, R., Wojciechowski, K. (eds) Computer Vision and Graphics. ICCVG 2016. Lecture Notes in Computer Science, vol 9972. Springer, Cham. https://doi.org/10.1007/978-3-319-46418-3_47

DOI: https://doi.org/10.1007/978-3-319-46418-3_47
Published: 10 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46417-6
Online ISBN: 978-3-319-46418-3
eBook Packages: Computer Science, Computer Science (R0)
