Abstract
The rapid evolution of information and digital technology has made the internet an effective medium for communication. It has also opened the way for data exploitation, so users must protect their data from misuse. This need led to the emergence of security frameworks such as Information Hiding. Steganography and steganalysis are the two primary techniques in the field of Information Hiding. Steganography is the science of concealing confidential information, while steganalysis is the art of detecting the existence of that information. The primary goal of this research is to address the general concept of steganalysis and the various breaches associated with it. The work applies a blind statistical steganalysis technique to Joint Photographic Experts Group (JPEG) images with embedded text, extracting features that reveal the alterations introduced during embedding. The images used as the embedding medium are uncalibrated, and the embedding percentage used in this study is 50%. The text is embedded using several steganographic schemes in the spatial and transform domains: Least Significant Bit (LSB) Matching, Least Significant Bit (LSB) Replacement, Pixel Value Differencing and F5. After the data are embedded, first-order, second-order, extended Discrete Cosine Transform (DCT) and Markov features are extracted. Principal Component Analysis (PCA) is then used to reduce the dimensionality of the feature set. Machine learning is incorporated by means of a classifier that distinguishes stego images from cover images. Support Vector Machine (SVM) and Support Vector Machine with Particle Swarm Optimization (SVM-PSO) are the classifiers examined in this paper for a comparative study. Cross-validation is also incorporated in this work. Six different kernel functions and four different sampling methods are used during classification to assess the effectiveness of the kernels and the sampling in classification.
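To make the difference between two of the spatial-domain schemes named above concrete, the following is a minimal sketch of LSB Replacement and LSB Matching at a 50% embedding rate. It is an illustration under stated assumptions, not the authors' implementation: the function names, the flat list of 8-bit pixel values, and the choice to embed into the first half of the pixels in order are all hypothetical, and the paper's actual pipeline works on JPEG images.

```python
import random


def text_to_bits(text):
    """Encode text as a list of bits (UTF-8, most significant bit first)."""
    return [(byte >> shift) & 1
            for byte in text.encode("utf-8")
            for shift in range(7, -1, -1)]


def lsb_replace(pixels, bits):
    """LSB Replacement: overwrite the lowest bit of each carrier pixel."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover capacity")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit
    return stego


def lsb_match(pixels, bits, rng):
    """LSB Matching: if the lowest bit differs, add or subtract 1 at random."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover capacity")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        if (stego[i] & 1) != bit:
            if stego[i] == 0:        # cannot go below the 8-bit range
                step = 1
            elif stego[i] == 255:    # cannot go above the 8-bit range
                step = -1
            else:
                step = rng.choice((-1, 1))
            stego[i] += step
    return stego


def extract_lsb(pixels, n_bits):
    """Read the message back from the lowest bit of the first n_bits pixels."""
    return [p & 1 for p in pixels[:n_bits]]


if __name__ == "__main__":
    rng = random.Random(7)                            # fixed seed, illustration only
    cover = [rng.randrange(256) for _ in range(64)]   # hypothetical 64-pixel cover
    embedding_rate = 0.5                              # 50%, as stated in the abstract
    capacity = int(len(cover) * embedding_rate)       # 32 carrier pixels
    message_bits = text_to_bits("text")[:capacity]    # 4 characters = 32 bits

    replaced = lsb_replace(cover, message_bits)
    matched = lsb_match(cover, message_bits, rng)
    print(extract_lsb(replaced, capacity) == message_bits)
    print(extract_lsb(matched, capacity) == message_bits)
```

Both variants change a pixel by at most one grey level and yield the same extracted bits. The difference is that replacement only ever moves a value within its even/odd pair, while matching can move it in either direction, so the two leave different statistical traces for a feature-based detector to pick up.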
Data Availability
Data and material can be made available upon request.
Ethics declarations
Ethics approval
The paper follows the journal ethics.
About this article
Cite this article
Shankar, D.D., Khalil, N. & Azhakath, A.S. Moderate embed cross validated and feature reduced Steganalysis using principal component analysis in spatial and transform domain with Support Vector Machine and Support Vector Machine-Particle Swarm Optimization. Multimed Tools Appl (2022). https://doi.org/10.1007/s11042-022-13638-w
DOI: https://doi.org/10.1007/s11042-022-13638-w