
Analysis of the Scalability of a Deep-Learning Network for Steganography “Into the Wild”

  • Conference paper
  • First Online:
Pattern Recognition. ICPR International Workshops and Challenges (ICPR 2021)

Abstract

Since the emergence of deep learning and its adoption in steganalysis, most reference articles have kept using small- to medium-size CNNs and have trained them on relatively small databases.

Benchmarks and comparisons between deep-learning-based steganalysis algorithms, more precisely CNNs, are thus made on small to medium databases. This is done without knowing:

  1. if the ranking, with a criterion such as accuracy, remains the same when the database is larger,

  2. if the efficiency of CNNs will collapse or not when the training database is orders of magnitude larger,

  3. the minimum size required for a database or a CNN in order to obtain a better result than a random guesser.

In this paper, after a solid discussion of the observed behaviour of CNNs as a function of their size and of the database size, we confirm that the power law of the error also holds in steganalysis, and this in a border case, i.e. with a medium-size network trained on a big, constrained and very diverse database.


Notes

  1. See the paper [17] and the discussions here: https://openreview.net/forum?id=ryenvpEKDr.

  2. The ResNet18 with width = 10 has 300,000 parameters and is in the interpolation-threshold region for the experiments run on CIFAR-10 and CIFAR-100 in [16].

  3. A too-small database could bias the analysis, since there is a region where the error increases when the dataset size increases (see [16]). For example, in [23], we report that the number of images needed for the medium-size Yedroudj-Net [25] to reach a region of good performance for spatial steganalysis (that is, the performance of a Rich Model with an Ensemble Classifier) is about 10,000 images (5,000 covers and 5,000 stegos) for the learning phase, in the case where there is no cover-source mismatch and the image size is 256 \(\times \) 256 pixels.

  4. The LSSD database is available at: http://www.lirmm.fr/~chaumont/LSSD.html.

  5. The experiment with 10M images was disrupted by a maintenance of the platform and took 22 days. Nevertheless, we reran the experiment on a similar platform, without any disruption during learning, and the duration was only 10 days. We report both values to be more precise.

  6. The initial point for the non-linear regression is set to \(a'=0.5\), \(\alpha '=0.001\) and \(c'_\infty = 0.01\), with \(c'_\infty \) forced to be positive. The Matlab function is fmincon and the stopping criterion is that the mean of the sum of squared errors is below \(10^{-6}\).
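A fit of this kind can be sketched as follows. This is a minimal illustration in Python with SciPy's `curve_fit` (a bounded least-squares routine standing in for Matlab's fmincon), run on synthetic (dataset size, test error) pairs with made-up parameter values, since the paper's actual measurements are not reproduced here:

```python
import numpy as np
from scipy.optimize import curve_fit

# Power-law error model: err(n) = a * n**(-alpha) + c_inf,
# where n is the training-set size and c_inf the asymptotic error.
def power_law(n, a, alpha, c_inf):
    return a * n ** (-alpha) + c_inf

# Synthetic measurements drawn from a known law (illustrative values only).
n = np.array([1e4, 5e4, 1e5, 5e5, 1e6, 5e6, 1e7])
rng = np.random.default_rng(0)
err = power_law(n, a=0.9, alpha=0.12, c_inf=0.05) + rng.normal(0.0, 1e-4, size=n.size)

# Initial point mirrors the note (a'=0.5, alpha'=0.001, c'_inf=0.01);
# the lower bound of 0 keeps all parameters, including c_inf, positive.
popt, _ = curve_fit(power_law, n, err,
                    p0=[0.5, 0.001, 0.01],
                    bounds=(0.0, np.inf))
a, alpha, c_inf = popt
print(f"a={a:.3f}, alpha={alpha:.3f}, c_inf={c_inf:.3f}")
```

With low-noise data spanning three decades of dataset size, the regression recovers the generating parameters closely; in practice the quality of such a fit depends on how many dataset sizes are sampled and how noisy the measured error is.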

References

  1. Abdulrahman, H., Chaumont, M., Montesinos, P., Magnier, B.: Color images steganalysis using RGB channel geometric transformation measures. Secur. Commun. Netw. 9(15), 2945–2956 (2016)


  2. Advani, M.S., Saxe, A.M., Sompolinsky, H.: High-dimensional dynamics of generalization error in neural networks. Neural Netw. 132, 428–446 (2020)


  3. Bas, P., Filler, T., Pevný, T.: “Break our steganographic system”: the ins and outs of organizing BOSS. In: Filler, T., Pevný, T., Craver, S., Ker, A. (eds.) IH 2011. LNCS, vol. 6958, pp. 59–70. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24178-9_5


  4. Belkin, M., Hsu, D., Ma, S., Mandal, S.: Reconciling modern machine-learning practice and the classical bias–variance trade-off. Proc. Natl. Acad. Sci. 116(32), 15849–15854 (2019)


  5. Boroumand, M., Chen, M., Fridrich, J.: Deep residual network for steganalysis of digital images. IEEE Trans. Inf. Forensics Secur. 14(5), 1181–1193 (2019)


  6. Chaumont, M.: Deep Learning in steganography and steganalysis. In: Hassaballah, M. (ed.) Digital Media Steganography: Principles, Algorithms, Advances, chap. 14, pp. 321–349. Elsevier (July 2020)


  7. Chubachi, K.: An ensemble model using CNNs on different domains for ALASKA2 image steganalysis. In: Proceedings of the IEEE International Workshop on Information Forensics and Security, WIFS 2020. Virtual Conference due to Covid, New-York, NY, USA, (December 2020)


  8. Cogranne, R., Giboulot, Q., Bas, P.: The ALASKA Steganalysis Challenge: a first step towards steganalysis. In: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec 2019, pp. 125–137. Paris, France (July 2019)


  9. Cogranne, R., Giboulot, Q., Bas, P.: Steganography by minimizing statistical detectability: the cases of JPEG and color images. In: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec 2020, pp. 161–167 (June 2020)


  10. Cogranne, R., Giboulot, Q., Bas, P.: Challenge academic research on steganalysis with realistic images. In: Proceedings of the IEEE International Workshop on Information Forensics and Security, WIFS 2020. Virtual Conference due to Covid (Formerly New-York, NY, USA) (December 2020)


  11. Fridrich, J.: Steganography in Digital Media. Cambridge University Press, New York (2009)


  12. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, pp. 770–778. Las Vegas, Nevada (June 2016)


  13. Hestness, J., et al.: Deep learning scaling is predictable, empirically. ArXiv preprint abs/1712.00409 (2017)

  14. Holub, V., Fridrich, J., Denemark, T.: Universal distortion function for steganography in an arbitrary domain. EURASIP J. Inf. Secur. 2014(1), 1–13 (2014). https://doi.org/10.1186/1687-417X-2014-1


  15. Huang, J., Ni, J., Wan, L., Yan, J.: A customized convolutional neural network with low model complexity for JPEG steganalysis. In: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec 2019, pp. 198–203. Paris, France (July 2019)


  16. Nakkiran, P., Kaplun, G., Bansal, Y., Yang, T., Barak, B., Sutskever, I.: Deep double descent: where bigger models and more data hurt. In: Proceedings of the Eighth International Conference on Learning Representations, ICLR 2020. Virtual Conference due to Covid (Formerly Addis Ababa, Ethiopia) (April 2020)


  17. Rosenfeld, J.S., Rosenfeld, A., Belinkov, Y., Shavit, N.: A constructive prediction of the generalization error across scales. In: Proceedings of the Eighth International Conference on Learning Representations, ICLR 2020. Virtual Conference due to Covid (Formerly Addis Ababa, Ethiopia) (April 2020)


  18. Ruiz, H., Yedroudj, M., Chaumont, M., Comby, F., Subsol, G.: LSSD: a controlled large JPEG image database for deep-learning-based steganalysis into the wild. In: Proceedings of the 25th International Conference on Pattern Recognition, ICPR 2021, Workshop on MultiMedia FORensics in the WILD, MMForWILD 2021, Lecture Notes in Computer Science, LNCS, Springer. Virtual Conference due to Covid (Formerly Milan, Italy) (January 2021). http://www.lirmm.fr/~chaumont/LSSD.html

  19. Sala, V.: Power law scaling of test error versus number of training images for deep convolutional neural networks. In: Proceedings of Multimodal Sensing: Technologies and Applications, vol. 11059, pp. 296–300. International Society for Optics and Photonics, SPIE, Munich (2019)


  20. Spigler, S., Geiger, M., d’Ascoli, S., Sagun, L., Biroli, G., Wyart, M.: A jamming transition from under-to over-parametrization affects generalization in deep learning. J. Phys. Math. Theor. 52(47), 474001 (2019)


  21. Tan, M., Le, Q.: EfficientNet: rethinking model scaling for convolutional neural networks. In: Proceedings of the 36th International Conference on Machine Learning, ICML 2019, PMLR, vol. 97, pp. 6105–6114. Long Beach, California, USA (June 2019)


  22. Ye, J., Ni, J., Yi, Y.: Deep learning hierarchical representations for image steganalysis. IEEE Trans. Inf. Forensics Secur. 12(11), 2545–2557 (2017)


  23. Yedroudj, M., Chaumont, M., Comby, F.: How to augment a small learning set for improving the performances of a CNN-based Steganalyzer? In: Proceedings of Media Watermarking, Security, and Forensics, MWSF 2018, Part of IS&T International Symposium on Electronic Imaging, EI 2018. p. 7. Burlingame, California, USA (28 January–2 February 2018)


  24. Yedroudj, M., Chaumont, M., Comby, F., Oulad Amara, A., Bas, P.: Pixels-off: data-augmentation complementary solution for deep-learning steganalysis. In: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec 2020, pp. 39–48. Virtual Conference due to Covid (Formerly Denver, CO, USA) (June 2020)


  25. Yedroudj, M., Comby, F., Chaumont, M.: Yedroudj-Net: an efficient CNN for spatial steganalysis. In: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2018, pp. 2092–2096. Calgary, Alberta, Canada (April 2018)


  26. Yousfi, Y., Butora, J., Fridrich, J., Giboulot, Q.: Breaking ALASKA: color separation for steganalysis in JPEG domain. In: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec 2019, pp. 138–149. Paris, France (July 2019)


  27. Yousfi, Y., Butora, J., Khvedchenya, E., Fridrich, J.: ImageNet pre-trained CNNs for JPEG steganalysis. In: Proceedings of the IEEE International Workshop on Information Forensics and Security, WIFS 2020. Virtual Conference due to Covid (Formerly New-York, NY, USA) (December 2020)


  28. Yousfi, Y., Fridrich, J.: JPEG steganalysis detectors scalable with respect to compression quality. In: Proceedings of Media Watermarking, Security, and Forensics, MWSF 2020, Part of IS&T International Symposium on Electronic Imaging, EI 2020, p. 10. Burlingame, California, USA (January 2020)


  29. Zeng, J., Tan, S., Liu, G., Li, B., Huang, J.: WISERNet: wider separate-then-reunion network for steganalysis of color images. IEEE Trans. Inf. Forensics Secur. 14(10), 2735–2748 (2019)



Acknowledgment

The authors would like to thank the French Defense Procurement Agency (DGA) for its support through the ANR Alaska project (ANR-18-ASTR-0009). We also thank IBM Montpellier and the Institute for Development and Resources in Intensive Scientific Computing (IDRIS/CNRS) for providing us access to High-Performance Computing resources.

Author information

Correspondence to Marc Chaumont.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Ruiz, H., Chaumont, M., Yedroudj, M., Amara, A.O., Comby, F., Subsol, G. (2021). Analysis of the Scalability of a Deep-Learning Network for Steganography “Into the Wild”. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science(), vol 12666. Springer, Cham. https://doi.org/10.1007/978-3-030-68780-9_36


  • DOI: https://doi.org/10.1007/978-3-030-68780-9_36


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-68779-3

  • Online ISBN: 978-3-030-68780-9

  • eBook Packages: Computer Science (R0)
