A computational approach for printed document forensics using SURF and ORB features

  • Methodologies and Application
Soft Computing

Abstract

Document forgery has become common owing to the availability of inexpensive scanners and printers. Important documents such as certificates, passports and identification cards are protected with watermarks or signatures. These are secured by a protective printing mechanism with extrinsic fingerprints, so such documents are easy to authenticate. Other documents require a passive approach to authentication. Passive approaches look for inconsistencies in a document that may indicate modification, and some of them attempt to determine the source of the printed document. This paper proposes a classifier-based model that identifies the source printer by assigning the questioned document to one of the printer classes. A novel approach that uses Speeded Up Robust Features (SURF) and Oriented FAST and Rotated BRIEF (ORB) feature descriptors is proposed for printer attribution. Naive Bayes, k-NN, random forest and different combinations of these classifiers were evaluated for classification. The proposed model can efficiently classify questioned documents into their respective printer classes. An accuracy of 86.5% was achieved using a combination of Naive Bayes, k-NN and random forest classifiers with a simple majority voting scheme and adaptive boosting.
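The simple majority voting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each of the three classifiers has already produced one printer label per questioned document. The label names, the sample predictions and the tie-breaking rule (prefer the earliest classifier in the list) are hypothetical and are not taken from the paper. Feature extraction and adaptive boosting are not shown.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[str]) -> str:
    """Return the label chosen by the most classifiers.

    Ties go to the earliest classifier in the list. This rule is an
    assumption of this sketch, not something stated in the abstract.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    top = max(counts.values())
    # Walk the labels in classifier order and return the first one
    # that reaches the highest vote count.
    for label in predictions:
        if counts[label] == top:
            return label
    raise AssertionError("unreachable")


def vote_per_document(per_classifier: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-classifier label lists into one label per document."""
    return [majority_vote(votes) for votes in zip(*per_classifier)]


# Hypothetical outputs of three classifiers for four questioned documents.
naive_bayes = ["printer_A", "printer_B", "printer_C", "printer_A"]
knn = ["printer_A", "printer_C", "printer_C", "printer_B"]
random_forest = ["printer_B", "printer_C", "printer_A", "printer_C"]

combined = vote_per_document([naive_bayes, knn, random_forest])
print(combined)
```

In the fourth document all three classifiers disagree, so the sketch falls back to the first classifier's label. A real system might instead use class probabilities or classifier weights to resolve such ties.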

Author information

Corresponding author

Correspondence to Munish Kumar.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Kumar, M., Gupta, S. & Mohan, N. A computational approach for printed document forensics using SURF and ORB features. Soft Comput 24, 13197–13208 (2020). https://doi.org/10.1007/s00500-020-04733-x

  • DOI: https://doi.org/10.1007/s00500-020-04733-x
