Abstract
For many years, the image databases used in steganalysis have been relatively small, i.e. about ten thousand images. This limits the diversity of images and thus prevents large-scale analysis of steganalysis algorithms.
In this paper, we describe a large JPEG database composed of 2 million colour and grey-scale images. This database, named LSSD for Large Scale Steganalysis Database, was obtained through intensive use of “controlled” development procedures. LSSD has been made publicly available, and we hope the steganalysis community will use it for large-scale experiments.
We introduce the pipeline used for building the various versions of the image database. We detail the general methodology, which can be used to redevelop the entire database and further increase its diversity. We also discuss the computational and storage costs of developing the images.
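To make the cost discussion concrete, the following is a minimal back-of-the-envelope sketch of how the storage and computation time for a database build might be estimated. Only the figure of 2 million images comes from the abstract; the mean JPEG size, the development time per image, the worker count, and all names (`DevelopmentPlan`, `storage_bytes`, `wall_clock_hours`) are hypothetical placeholders, not values or tools reported by the authors.

```python
from dataclasses import dataclass


@dataclass
class DevelopmentPlan:
    """Hypothetical description of one database build (illustrative only)."""
    n_images: int             # number of JPEG images to produce
    mean_jpeg_bytes: int      # ASSUMED average size of one developed JPEG
    seconds_per_image: float  # ASSUMED time to develop one image on one worker
    n_workers: int            # ASSUMED number of parallel workers


def storage_bytes(plan: DevelopmentPlan) -> int:
    """Total storage needed if every image has the assumed mean size."""
    return plan.n_images * plan.mean_jpeg_bytes


def wall_clock_hours(plan: DevelopmentPlan) -> float:
    """Elapsed time assuming perfect parallelism across the workers."""
    total_cpu_seconds = plan.n_images * plan.seconds_per_image
    return total_cpu_seconds / plan.n_workers / 3600


# 2 million images is from the abstract; every other value is a placeholder.
plan = DevelopmentPlan(
    n_images=2_000_000,
    mean_jpeg_bytes=100_000,
    seconds_per_image=2.0,
    n_workers=100,
)

print(f"storage: {storage_bytes(plan) / 1e9:.0f} GB")
print(f"elapsed: {wall_clock_hours(plan):.1f} h")
```

Replacing the placeholder values with measured ones gives a first-order estimate; it ignores intermediate files and any scheduling overhead on a shared cluster.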
Notes
- 1. Website of the ALASKA challenge #2: https://alaska.utt.fr/ Download page: http://alaska.utt.fr/ALASKA_v2_RAWs_scripts.zip.
- 2. BOSS challenge: http://agents.fel.cvut.cz/boss/index.php?mode=VIEW&tmpl=about Download page: ftp://mas22.felk.cvut.cz/RAWs.
- 3. Obsolete download link: http://mmlab.science.unitn.it/RAISE/.
- 4.
- 5. Site closed on February 17, 2020: http://wesaturate.com/.
- 6.
- 7. Documentation: https://pillow.readthedocs.io/en/stable/.
- 8. Software available at: http://rawtherapee.com More information can be found at: http://rawpedia.rawtherapee.com.
- 9. Documentation about the different demosaicing methods of RawTherapee can be found at: https://rawpedia.rawtherapee.com/Demosaicing.
Acknowledgment
The authors would like to thank the French Defense Procurement Agency (DGA) for its support through the ANR Alaska project (ANR-18-ASTR-0009). We also thank IBM Montpellier and the Institute for Development and Resources in Intensive Scientific Computing (IDRIS/CNRS) for giving us access to High-Performance Computing resources.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Ruiz, H., Yedroudj, M., Chaumont, M., Comby, F., Subsol, G. (2021). LSSD: A Controlled Large JPEG Image Database for Deep-Learning-Based Steganalysis “Into the Wild”. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12666. Springer, Cham. https://doi.org/10.1007/978-3-030-68780-9_38
DOI: https://doi.org/10.1007/978-3-030-68780-9_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68779-3
Online ISBN: 978-3-030-68780-9
eBook Packages: Computer Science (R0)