Abstract
Since the emergence of deep learning and its adoption in steganalysis, most reference articles have kept using small- to medium-sized CNNs and have trained them on relatively small databases. Benchmarks and comparisons between deep-learning-based steganalysis algorithms, more precisely CNNs, are thus made on small to medium databases. This is performed without knowing:
1. whether the ranking, under a criterion such as accuracy, remains the same when the database is larger,
2. whether the efficiency of CNNs collapses or not when the training database is orders of magnitude larger,
3. the minimum size required for a database or a CNN in order to obtain a better result than a random guesser.
In this paper, after a thorough discussion of the observed behaviour of CNNs as a function of their size and the database size, we confirm that the power law of the error also holds in steganalysis, and this in a borderline case, i.e. with a medium-sized network on a big, constrained and very diverse database.
Notes
1. See the paper [17], and the discussions here: https://openreview.net/forum?id=ryenvpEKDr.
2. The ResNet18 with a width of 10 is made of 300,000 parameters and lies in the interpolation-threshold region for experiments run on CIFAR-10 and CIFAR-100 in [16].
3. Too small a database could bias the analysis, since there is a region where the error increases as the dataset grows (see [16]). For example, in [23], we report that the number of images needed for the medium-sized Yedroudj-Net [25] to reach a region of good performance for spatial steganalysis (that is, the performance of a Rich Model with an Ensemble Classifier) is about 10,000 images (5,000 covers and 5,000 stegos) for the learning phase, in the case where there is no cover-source mismatch and the image size is 256 × 256 pixels.
4. The LSSD database is available at: http://www.lirmm.fr/~chaumont/LSSD.html.
5. Experiments with 10M images were disrupted by a maintenance of the platform and took 22 days. Nevertheless, we reran the experiment on a similar platform, without any disruption in learning, and the duration was only 10 days. We report both values to be more precise.
6. The initial point for the non-linear regression is set to \(a'=0.5\), \(\alpha '=0.001\) and \(c'_\infty = 0.01\), with \(c'_\infty\) forced to be positive. The Matlab function is fmincon and the stopping criterion is that the mean of the sum of squared errors is under \(10^{-6}\).
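The regression described in this note can be sketched as follows. This is a re-implementation with SciPy's `curve_fit` rather than the authors' Matlab `fmincon` setup; the model form \(e(n) = a\,n^{-\alpha} + c_\infty\) follows the power-law discussion in the paper, but the data points below are illustrative assumptions, not the paper's measurements:

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, alpha, c_inf):
    """Test error as a function of the training-set size n:
    e(n) = a * n**(-alpha) + c_inf, with c_inf the irreducible error."""
    return a * n ** (-alpha) + c_inf

# Hypothetical (database size, error-probability) points, for illustration only.
n_train = np.array([1e4, 1e5, 1e6, 1e7])
error = np.array([0.45, 0.40, 0.33, 0.28])

# Initial point from Note 6: a'=0.5, alpha'=0.001, c'_inf=0.01,
# with c_inf constrained to be positive.
p0 = [0.5, 0.001, 0.01]
bounds = ([-np.inf, -np.inf, 0.0], [np.inf, np.inf, np.inf])
params, _ = curve_fit(power_law, n_train, error, p0=p0, bounds=bounds)
a, alpha, c_inf = params
print(f"a={a:.3f}, alpha={alpha:.4f}, c_inf={c_inf:.4f}")
```

Once fitted, `c_inf` estimates the asymptotic error floor that no amount of additional training data would remove, while `alpha` governs how fast the error decays with the database size.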
References
Abdulrahman, H., Chaumont, M., Montesinos, P., Magnier, B.: Color images steganalysis using RGB channel geometric transformation measures. Secur. Commun. Netw. 15, 2945–2956 (2016)
Advani, M.S., Saxe, A.M., Sompolinsky, H.: High-dimensional dynamics of generalization error in neural networks. Neural Netw. 132, 428–446 (2020)
Bas, P., Filler, T., Pevný, T.: “Break our steganographic system”: the ins and outs of organizing BOSS. In: Filler, T., Pevný, T., Craver, S., Ker, A. (eds.) IH 2011. LNCS, vol. 6958, pp. 59–70. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24178-9_5
Belkin, M., Hsu, D., Ma, S., Mandal, S.: Reconciling modern machine-learning practice and the classical bias–variance trade-off. Proc. Nat. Acad. Sci. 32, 15849–15854 (2019)
Boroumand, M., Chen, M., Fridrich, J.: Deep residual network for steganalysis of digital images. IEEE Trans. Inf. Forensics Secur. 5, 1181–1193 (2019)
Chaumont, M.: Deep Learning in steganography and steganalysis. In: Hassaballah, M. (ed.) Digital Media Steganography: Principles, Algorithms, Advances, chap. 14, pp. 321–349. Elsevier (July 2020)
Chubachi, K.: An ensemble model using CNNs on different domains for ALASKA2 image steganalysis. In: Proceedings of the IEEE International Workshop on Information Forensics and Security, WIFS 2020. Virtual Conference due to Covid, New-York, NY, USA, (December 2020)
Cogranne, R., Giboulot, Q., Bas, P.: The ALASKA Steganalysis Challenge: a first step towards steganalysis. In: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec 2019, pp. 125–137. Paris, France (July 2019)
Cogranne, R., Giboulot, Q., Bas, P.: Steganography by minimizing statistical detectability: the cases of JPEG and color images. In: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec 2020, pp. 161–167 (June 2020)
Cogranne, R., Giboulot, Q., Bas, P.: Challenge academic research on steganalysis with realistic images. In: Proceedings of the IEEE International Workshop on Information Forensics and Security, WIFS 2020. Virtual Conference due to Covid (Formerly New-York, NY, USA) (December 2020)
Fridrich, J.: Steganography in Digital Media. Cambridge University Press, New York (2009)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, pp. 770–778. Las Vegas, Nevada (June 2016)
Hestness, J., et al.: Deep learning scaling is predictable, empirically. arXiv preprint arXiv:1712.00409 (2017)
Holub, V., Fridrich, J., Denemark, T.: Universal distortion function for steganography in an arbitrary domain. EURASIP J. Inf. Secur. 2014(1), 1–13 (2014). https://doi.org/10.1186/1687-417X-2014-1
Huang, J., Ni, J., Wan, L., Yan, J.: A customized convolutional neural network with low model complexity for JPEG steganalysis. In: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec 2019, pp. 198–203. Paris, France (July 2019)
Nakkiran, P., Kaplun, G., Bansal, Y., Yang, T., Barak, B., Sutskever, I.: Deep double descent: where bigger models and more data hurt. In: Proceedings of the Eighth International Conference on Learning Representations, ICLR 2020. Virtual Conference due to Covid (Formerly Addis Ababa, Ethiopia) (April 2020)
Rosenfeld, J.S., Rosenfeld, A., Belinkov, Y., Shavit, N.: A constructive prediction of the generalization error across scales. In: Proceedings of the Eighth International Conference on Learning Representations, ICLR 2020. Virtual Conference due to Covid (Formerly Addis Ababa, Ethiopia) (April 2020)
Ruiz, H., Yedroudj, M., Chaumont, M., Comby, F., Subsol, G.: LSSD: a controlled large JPEG image database for deep-learning-based steganalysis "into the wild". In: Proceedings of the 25th International Conference on Pattern Recognition, ICPR 2021, Workshop on MultiMedia FORensics in the WILD, MMForWILD 2021, Lecture Notes in Computer Science, LNCS, Springer. Virtual Conference due to Covid (Formerly Milan, Italy) (January 2021). http://www.lirmm.fr/~chaumont/LSSD.html
Sala, V.: Power law scaling of test error versus number of training images for deep convolutional neural networks. In: Proceedings of the multimodal sensing: technologies and applications. vol. 11059, pp. 296–300. International Society for Optics and Photonics, SPIE, Munich (2019)
Spigler, S., Geiger, M., d’Ascoli, S., Sagun, L., Biroli, G., Wyart, M.: A jamming transition from under-to over-parametrization affects generalization in deep learning. J. Phys. Math. Theor. 52(47), 474001 (2019)
Tan, M., Le, Q.: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In: Proceedings of the 36th International Conference on Machine Learning, PMLR 2019, vol. 97, pp. 6105–6114. Long Beach, California, USA (June 2019)
Ye, J., Ni, J., Yi, Y.: Deep learning hierarchical representations for image steganalysis. IEEE Trans. Inf. Forensics Secur. 11, 2545–2557 (2017)
Yedroudj, M., Chaumont, M., Comby, F.: How to augment a small learning set for improving the performances of a CNN-based Steganalyzer? In: Proceedings of Media Watermarking, Security, and Forensics, MWSF 2018, Part of IS&T International Symposium on Electronic Imaging, EI 2018. p. 7. Burlingame, California, USA (28 January–2 February 2018)
Yedroudj, M., Chaumont, M., Comby, F., Oulad Amara, A., Bas, P.: Pixels-off: data-augmentation complementary solution for deep-learning steganalysis. In: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec 2020, pp. 39–48. Virtual Conference due to Covid (Formerly Denver, CO, USA) (June 2020)
Yedroudj, M., Comby, F., Chaumont, M.: Yedroudj-Net: an efficient CNN for spatial steganalysis. In: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2018, pp. 2092–2096. Calgary, Alberta, Canada (April 2018)
Yousfi, Y., Butora, J., Fridrich, J., Giboulot, Q.: Breaking ALASKA: color separation for steganalysis in JPEG domain. In: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec 2019, pp. 138–149. Paris, France (July 2019)
Yousfi, Y., Butora, J., Khvedchenya, E., Fridrich, J.: ImageNet pre-trained CNNs for JPEG steganalysis. In: Proceedings of the IEEE International Workshop on Information Forensics and Security, WIFS 2020. Virtual Conference due to Covid (Formerly New-York, NY, USA) (December 2020)
Yousfi, Y., Fridrich, J.: JPEG steganalysis detectors scalable with respect to compression quality. In: Proceedings of Media Watermarking, Security, and Forensics, MWSF 2020, Part of IS&T International Symposium on Electronic Imaging, EI 2020, p. 10. Burlingame, California, USA (January 2020)
Zeng, J., Tan, S., Liu, G., Li, B., Huang, J.: WISERNet: wider separate-then-reunion network for steganalysis of color images. IEEE Trans. Inf. Forensics Secur. 10, 2735–2748 (2019)
Acknowledgment
The authors would like to thank the French Defense Procurement Agency (DGA) for its support through the ANR Alaska project (ANR-18-ASTR-0009). We also thank IBM Montpellier and the Institute for Development and Resources in Intensive Scientific Computing (IDRISS/CNRS) for providing us access to High-Performance Computing resources.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Ruiz, H., Chaumont, M., Yedroudj, M., Amara, A.O., Comby, F., Subsol, G. (2021). Analysis of the Scalability of a Deep-Learning Network for Steganography “Into the Wild”. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science(), vol 12666. Springer, Cham. https://doi.org/10.1007/978-3-030-68780-9_36
DOI: https://doi.org/10.1007/978-3-030-68780-9_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68779-3
Online ISBN: 978-3-030-68780-9
eBook Packages: Computer Science (R0)