Digital Forensic Source Camera Identification with Efficient Feature Selection Using Filter, Wrapper and Hybrid Approaches

  • Conference paper
Information Systems Security (ICISS 2016)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 10063)

Abstract

Digital Forensics is the branch of science dealing with the investigation of evidence recovered from digital devices, to safeguard against the rapidly increasing cyber crimes in today's digital world. The Source Camera Identification (SCI) problem is to map an image under question correctly to its source device. Following a Digital Forensic approach, the source of an image is detected by a posteriori investigation of the traces left behind in the image by the camera. Such traces are generated by the post-processing operations an image undergoes inside a digital camera after being captured. In this paper, we model the SCI problem as a machine learning classification problem and focus on the most crucial component of a learning model, i.e. feature selection. We propose three different techniques for feature selection: a Filter based approach, a Wrapper based approach using a Genetic Algorithm (GA), and a hybrid approach that combines the Filter and Wrapper methods. We investigate the source detection accuracy that each technique achieves. Our experimental results suggest that the proposed methods produce a much more compact feature set, and hence considerably improve the source detection accuracy and reduce the training time of the learning model, as compared to the state of the art.

References

  1. Celiktutan, O., Sankur, B., Avcibas, I.: Blind identification of source cell-phone model. IEEE Trans. Inf. Forensics Secur. 3(3), 553–566 (2008)

  2. Bayram, S., Sencar, H.T., Memon, N.: Improvements on source camera-model identification based on CFA interpolation. In: Proceeding of WG (2006)

  3. Kharrazi, M., Sencar, H.T., Memon, N.: Blind source camera identification. In: International Conference on Image Processing (ICIP) (2004)

  4. Tsai, M.-J.: Adaptive feature selection for digital camera source identification. In: IEEE International Symposium on Circuits and Systems, pp. 412–415 (2008)

  5. Tsai, M.-J.: A hybrid model for digital camera source identification. In: IEEE International Conference on Image Processing (ICIP), pp. 2901–2904 (2009)

  6. Lukas, J.: Digital camera identification from sensor pattern noise. IEEE Trans. Inf. Forensics Secur. 1(2), 205–214 (2006)

  7. Li, C.-T.: Digital camera identification from sensor pattern noise. IEEE Trans. Inf. Forensics Secur. 5(2), 280–287 (2010)

  8. Lin, X., Li, C.-T.: Preprocessing reference sensor pattern noise via spectrum equalization. IEEE Trans. Inf. Forensics Secur. 11(1), 126–140 (2016)

  9. Biney, A.G., Sellahewa, H.: Analysis of smartphone model identification using digital images. In: International Conference on Image Processing (ICIP) (2013)

  10. Bayram, S., Avcibas, I., Sankur, B., Memon, N.: Image manipulation detection. J. Electronic Imaging 15(4), 041102 (2006). International Society for Optics and Photonics

  11. Avcibas, I., Sankur, B., Memon, N.: Image steganalysis with binary similarity measures. In: International Conference on Image Processing (ICIP), vol. 3 (2002)

  12. Avcibas, I., Memon, N., Sankur, B.: Steganalysis using image quality metrics. IEEE Trans. Image Process. 12(2), 221–229 (2003)

  13. Lyu, S., Farid, H.: Steganalysis using higher-order image statistics. IEEE Trans. Inf. Forensics Secur. 1(1), 111–119 (2006)

  14. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)

  15. Schaffernicht, E., Gross, H.M.: Weighted mutual information for feature selection. In: International Conference on Artificial Neural Networks (2011)

  16. Van Hulse, J., Khoshgoftaar, T.M., Napolitano, A., Wald, R.: Threshold-based feature selection techniques for high-dimensional bioinformatics data. Network Modeling Anal. Health Inform. Bioinform. 1(1), 47–61 (2012)

  17. Liu, D., Cho, S.Y., Sun, D.M., Qiu, Z.D.: A spearman correlation coefficient ranking for matching-score fusion on speaker recognition. In: TENCON (2010)

  18. Yuan, C., Sun, D., Liu, D., Cho, S. Y., Zhang, Y.: A research on feature selection and fusion in palmprint recognition. In: International Workshop on Emerging Techniques and Challenges for Hand-Based Biometrics (ETCHB) (2010)

  19. Onpans, J., Rasmequan, S., Jantarakongkul, B., Chinnasarn, K., Rodtook, A.: Intrusion feature selection using modified heuristic greedy algorithm of itemset. In: International Symposium on Communications and Information Technologies (ISCIT) (2013)

  20. Rachburee, N., Punlumjeak, W.: A comparison of feature selection approach between Greedy, IG-ratio, Chi-square, and mRMR in educational mining. In: International Conference on Information Technology and Electrical Engineering (ICITEE) (2015)

  21. Bhasin, V., Bedi, P., Singhal, A.: Feature selection for steganalysis based on modified stochastic diffusion search using fisher score. In: International Conference on Advances in Computing, Communications and Informatics (ICACCI), September 2014

  22. Singh, B., Sankhwar, J.S., Vyas, O.P.: Optimization of feature selection method for high dimensional data using fisher score and minimum spanning tree. In: INDICON, December 2014

  23. Xu, J., Yin, Y., Man, H., He, H.: Feature selection based on sparse imputation. In: International Joint Conference on Neural Networks (IJCNN), June 2012

  24. Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artif. Intell. 97(1), 273–324 (1997)

  25. Chen, Y.-H., Lin, T.-C.: Dimension reduction techniques for accessing Chinese readability. In: International Conference on Machine Learning and Cybernetics, July 2014

  26. Packianather, M.S., Kapoor, B.: A wrapper-based feature selection approach using bees algorithm for a wood defect classification system. In: System of Systems Engineering Conference (2015)

  27. Yu, E., Cho, S.: GA-SVM wrapper approach for feature subset selection in keystroke dynamics identity verification. In: Proceedings of the International Joint Conference on Neural Networks (2003)

  28. Talukder, K.H., Harada, K.: Haar wavelet based approach for image compression and quality assessment of compressed image. Int. J. Appl. Math. 36(1) (2007)

  29. Gunawan, I.P., Halim, A.: Haar wavelet decomposition based blockiness detector and picture quality assessment method for JPEG images. In: International Conference on Advanced Computer Science and Information System (2011)

  30. Gloe, T., Böhme, R.: ‘Dresden image database’ for benchmarking digital image forensics. In: IEEE International Conference on Acoustics, Speech and Signal Processing (2007)

  31. Chang, C.-C., Lin, C.-J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3), 27:1–27:27 (2011)

  32. Ng, A.: CS229 Lecture Notes. Stanford (2000)

Corresponding author

Correspondence to Venkata Udaya Sameer.

A Appendix: Statistical Measures Used as Feature Filters

  • The Chi-Squared test is a statistical method that measures the independence of two variables. In feature selection, it is used to check whether the class variable is independent of a feature. Let \(O_{ij}\) be the observed frequency and \(E_{ij}\) the expected frequency of samples with the ith feature value in class j; then chi-squared [19, 20] is defined as

    $$\begin{aligned} \chi ^2 = \sum _{i}\sum _{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \end{aligned}$$
    (5)
    $$\begin{aligned} E_{ij} = \frac{(R_{T_i})(C_{T_j})}{N} \end{aligned}$$
    (6)

    where \(R_{T_i}\) is the number of samples with the ith feature value, \(C_{T_j}\) is the number of samples in class j, and N is the total number of samples.

  • The Mutual Information [15] method measures how much a variable contributes towards reducing the uncertainty about the target variable (class). In datasets with many features, it selects the features whose joint distribution with the target class variable has the highest mutual information.

  • The Fisher Score measures the variance between the expected value of the information and the observed value. The information is maximized when the variance is minimized. Consider a dataset with c classes, with \(n_j\) samples in class j, \(\mu _j\) the mean value of the feature in class j, \(\mu \) the mean value over all samples, and \(\sigma _j^2\) the variance of class j. Then the Fisher score [21,22,23] \(S_k\) for feature \(F_k\) is defined as

    $$\begin{aligned} S_k = \frac{\sum _{j=1}^{c}n_j(\mu _j-\mu )^2}{\sum _{j=1}^{c}n_j\sigma _j^2} \end{aligned}$$
    (7)
  • The Pearson Correlation Coefficient is a statistical measure of the strength of the linear correlation between two variables. It is computed by dividing the covariance of the two variables by the product of their standard deviations. The Pearson correlation coefficient [14] is defined as

    $$\begin{aligned} R = \frac{cov(X,Y)}{ \sqrt{var(X) var(Y)}} \end{aligned}$$
    (8)

    where cov denotes the covariance and var the variance. Therefore,

    $$\begin{aligned} R = \frac{\sum _{k=1}^{m}(x_k-\bar{x})(y_k-\bar{y})}{\sqrt{\sum _{k=1}^{m}(x_k-\bar{x})^{2} \sum _{k=1}^{m}(y_k-\bar{y})^{2}}} \end{aligned}$$
    (9)
  • Kendall’s Tau rank correlation [16] is a statistical measure of the degree of similarity between the rankings of two variables. Consider n samples, with \(n_c\) concordant pairs (ordered in the same way by both variables) and \(n_d\) discordant pairs (ordered differently). Kendall’s Tau is defined as

    $$\begin{aligned} \tau = \frac{n_c-n_d}{\frac{n(n-1)}{2}} \end{aligned}$$
    (10)
  • The Spearman Correlation is a statistical measure that expresses the degree to which two variables are monotonically related. Consider n samples, where \(x_i\) are the sample values of X with ranks \(r(x_i)\), and \(y_i\) are the values of Y (class) with ranks \(r(y_i)\). The Spearman coefficient [17, 18] is calculated as

    $$\begin{aligned} s(X,Y) = 1-\frac{6\sum _{i=1}^{n}(r(x_i)-r(y_i))^2}{n(n^2-1)} \end{aligned}$$
    (11)

    The above filters are applied in this paper to a feature set of 598 features, as discussed in Sect. 3.2. Tables 1 and 2 show the performance of the above filters with respect to accuracy and F-score.
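
The six filter scores defined above can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the authors' implementation: all function names are hypothetical, the chi-squared and mutual information functions assume a discrete (or already binned) feature, the mutual information uses the standard textbook definition since the appendix gives no formula for it, the Fisher score uses the population variance for \(\sigma _j^2\), and the Spearman function assumes there are no tied values.

```python
import math
from collections import Counter


def chi_squared(feature, labels):
    """Eqs. (5)-(6): chi-squared between a discrete feature and the class."""
    n = len(feature)
    row_totals = Counter(feature)              # R_{T_i}
    col_totals = Counter(labels)               # C_{T_j}
    observed = Counter(zip(feature, labels))   # O_{ij}; missing cells count as 0
    score = 0.0
    for i, r in row_totals.items():
        for j, c in col_totals.items():
            expected = r * c / n               # E_{ij}
            score += (observed[(i, j)] - expected) ** 2 / expected
    return score


def mutual_information(feature, labels):
    """Standard I(X;Y) in nats for a discrete feature (assumed definition)."""
    n = len(feature)
    px, py = Counter(feature), Counter(labels)
    pxy = Counter(zip(feature, labels))
    return sum((nxy / n) * math.log(nxy * n / (px[x] * py[y]))
               for (x, y), nxy in pxy.items())


def fisher_score(feature, labels):
    """Eq. (7): between-class scatter over within-class scatter."""
    mu = sum(feature) / len(feature)
    between = within = 0.0
    for cls in set(labels):
        vals = [x for x, y in zip(feature, labels) if y == cls]
        n_j = len(vals)
        mu_j = sum(vals) / n_j
        var_j = sum((v - mu_j) ** 2 for v in vals) / n_j  # population variance
        between += n_j * (mu_j - mu) ** 2
        within += n_j * var_j
    return between / within  # undefined if every class has zero variance


def pearson(x, y):
    """Eq. (9): Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def kendall_tau(x, y):
    """Eq. (10): (concordant - discordant) pairs over all n(n-1)/2 pairs."""
    n = len(x)
    n_c = n_d = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                n_c += 1
            elif s < 0:
                n_d += 1
    return (n_c - n_d) / (n * (n - 1) / 2)


def ranks(values):
    """1-based ranks of the values; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Eq. (11): Spearman coefficient from squared rank differences (no ties)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

As a small check on toy data, `fisher_score([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1])` gives 4.0, while `kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])` gives about 0.667 and `spearman` on the same pair gives 0.8. In a filter setting, one such score would be computed per feature and the features ranked by it; how the paper thresholds or combines the rankings is not specified in this appendix.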

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Sameer, V.U., Sugumaran, S., Naskar, R. (2016). Digital Forensic Source Camera Identification with Efficient Feature Selection Using Filter, Wrapper and Hybrid Approaches. In: Ray, I., Gaur, M., Conti, M., Sanghi, D., Kamakoti, V. (eds) Information Systems Security. ICISS 2016. Lecture Notes in Computer Science, vol 10063. Springer, Cham. https://doi.org/10.1007/978-3-319-49806-5_22

  • DOI: https://doi.org/10.1007/978-3-319-49806-5_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49805-8

  • Online ISBN: 978-3-319-49806-5

  • eBook Packages: Computer Science, Computer Science (R0)
