
Even big data is not enough: need for a novel reference modelling for forensic document authentication

Abstract

With the emergence of big data, deep learning (DL) approaches are becoming popular in many branches of science, and forensic science is no exception. However, there are certain problems in forensic science whose solutions would hardly benefit from recent advances in DL algorithms. Document authentication is one such problem: many reference samples are available, and in a big data scenario there would probably be even more, but the number of defective or forged samples remains an issue. Experts often encounter situations where no forged samples, or only a handful, are available. In such situations, data-hungry algorithms are ill-suited because they cannot learn the forged samples properly. This paper addresses this problem and proposes a novel reference modelling framework for forensic document authentication. The approach is based on Mahalanobis space. Two questioned document examination problems are studied to show the effectiveness of the reference modelling algorithm, which is also compared with a commonly used learning approach, namely neural network-based classification.
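To make the idea of reference modelling in a Mahalanobis space more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that the reference samples are feature vectors extracted from genuine documents, that a single mean and covariance describe them adequately, and that a questioned document is flagged when its squared Mahalanobis distance exceeds a percentile of the reference distances. The function names, the synthetic three-dimensional features, the ridge term and the 99th-percentile threshold are all illustrative assumptions.

```python
import numpy as np


def fit_reference_model(reference):
    """Estimate the mean and inverse covariance from reference samples only."""
    reference = np.asarray(reference, dtype=float)
    mean = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)
    # Small ridge term (an assumption) keeps the covariance invertible.
    cov = cov + 1e-6 * np.eye(cov.shape[0])
    return mean, np.linalg.inv(cov)


def mahalanobis_sq(x, mean, inv_cov):
    """Squared Mahalanobis distance of each row of x from the reference mean."""
    diff = np.atleast_2d(np.asarray(x, dtype=float)) - mean
    return np.einsum("ij,jk,ik->i", diff, inv_cov, diff)


# Synthetic stand-in for features measured on reference documents (hypothetical data).
rng = np.random.default_rng(0)
reference = rng.normal(loc=[0.0, 0.0, 0.0], scale=[1.0, 0.5, 2.0], size=(500, 3))

mean, inv_cov = fit_reference_model(reference)

# Threshold chosen from the reference samples alone; no forged samples are needed.
threshold = np.quantile(mahalanobis_sq(reference, mean, inv_cov), 0.99)

# Two hypothetical questioned documents: one close to the reference, one far away.
questioned = np.array([[0.1, -0.2, 0.5], [6.0, 4.0, -9.0]])
distances = mahalanobis_sq(questioned, mean, inv_cov)
is_suspect = distances > threshold
print(is_suspect)
```

The point of the sketch is that both the model and the decision threshold are estimated from reference samples only, which is why this kind of approach does not depend on having a large set of forged examples.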




Author information

Correspondence to Utpal Garain.

Cite this article

Garain, U., Halder, B. Even big data is not enough: need for a novel reference modelling for forensic document authentication. IJDAR 23, 1–11 (2020). https://doi.org/10.1007/s10032-019-00345-w
