1 Introduction

In recent years, owing to their immediacy and easily understandable content, images have become a prime medium of information exchange. They are used as evidence in legal matters, as proof of experiments, in the media, in reporting real-world events, and so on. At the same time, the availability of sophisticated image manipulation software and pervasive imaging devices has created the need for forensic toolboxes that can assess the authenticity of an image without knowing the original source information. Hence, numerous forensic methods have been proposed that focus on detecting such malicious post-processing of images. Based on the manipulation technique used, image forgery is divided into three categories: copy-move, image splicing, and image resampling. Copy-move forgery, also called image cloning, copies a sub-part of an image and pastes it elsewhere in the same image to hide important information, whereas image splicing uses a cut-and-paste technique in which parts of one or more images are pasted into the same or a different image. Image resampling applies a geometric transformation to an image, such as scaling, stretching, rotation, skewing, or flipping. Figure 1 depicts a few examples of image manipulation downloaded from the Internet.

Fig. 1
figure 1

Examples of image forgery a copy-move, b image splicing, c image resampling

In the course of this paper, we focus on image splicing, one of the most widely used techniques for image tampering. It involves combining or compositing two or more images to produce a forged image. Splicing detection uses a passive approach in which no prior information about the image is known. In recent years, researchers have proposed several methods for image splicing forgery detection. Shi et al. [1] proposed an image model that derives statistical features by treating the neighboring differences of the block discrete cosine transform of an image as a 1D signal and modeling the dependencies between neighboring nodes along certain directions as a Markov chain. These features are used as discriminative features for a support vector machine (SVM) classifier. Li et al. [2] proposed an approach based on the Hilbert–Huang transform (HHT) and moments of the wavelet transform characteristic function. They used an SVM classifier for spliced image classification, achieving an accuracy of 85.86%.

The method proposed by Zhao et al. [3] is based on gray-level run-length run-number (RLRN) features and chroma channels. Features are extracted as gray-level RLRN vectors along four different directions from the decorrelated chroma channel. The extracted features are fed to an SVM for classification.

Pevny et al. [4] proposed a method based on the SPAM feature, modeled as a second-order Markov matrix along certain directions and used as a discriminative feature for an SVM classifier. Later, Kirchner and Fridrich [5] extended SPAM to detect median filtering of JPEG-compressed images, which supports image tampering detection. Besides the above methods, Markov model-based approaches that utilize local transition features have shown promising splicing detection accuracy. He et al. [6] introduced a Markov model in the DCT as well as the DWT domain. The difference coefficient arrays and transition probability matrices are modeled as the feature vector, and the cross-domain Markov features are used as discriminative features for an SVM classifier. However, this approach requires up to 7290 features. An enhanced state selection method was proposed by Su et al. [7]. In this approach, the authors consider previously proposed function models and map the large number of coefficients extracted from the transform domain to specific states. However, by reducing the number of features, this method sacrifices detection performance. Zhao et al. [8] proposed a model in which an image is treated as a 2D non-causal signal and the dependencies between the current node and its neighbors are captured. The model is applied in the BDCT and DMWT domains, and the combined extracted features are used for classification. Their method achieves a better detection rate, but at the cost of a high feature dimension of 14,240.

From the above discussion, it can be concluded that Markov model-based approaches suffer from information loss and high feature dimension, both of which depend directly on the threshold selection. A larger threshold value can minimize information loss, but it also increases the feature dimension, which can cause overfitting and reduce detection capability. Therefore, the choice of threshold becomes a trade-off between detection performance and computational cost. In this paper, an enhanced threshold method is proposed that yields a much smaller feature dimension even with a large threshold value, improving the computational cost as well as the detection rate, as discussed in step 3 of the proposed work.

The rest of the paper is organized as follows. Section 2 presents the proposed algorithm framework. The experimental results and comparisons with other methods are given in Sect. 3, followed by the conclusion in Sect. 4.

2 Proposed Method

In this paper, we propose a model in which features are extracted from the discrete cosine transform (DCT) and discrete Meyer wavelet transform (DMWT) domains, and an enhanced threshold method is used to reduce both information loss and computational cost, resulting in improved detection capability. After all the related features are generated, an SVM is used as the classifier to distinguish authentic and spliced images. The proposed algorithm framework is shown in Fig. 2.

Fig. 2
figure 2

Flow diagram of proposed model

2.1 Algorithm Flow

  • Divide the input image into \(8 \times 8\) non-overlapping blocks.

  • Apply the DCT to each sub-block.

  • Round the coefficients and obtain the difference arrays in the horizontal and vertical directions.

  • Apply the enhanced threshold method to calculate the Markov matrices.

  • Apply the same process in the DMWT domain; considering the dependencies among wavelet coefficients, more Markov features are extracted.

  • Combine all the features extracted from the DCT and DMWT domains.

  • Use an SVM classifier to distinguish authentic and spliced images.

2.2 Extracting Splicing Artifacts

Feature Extraction in DCT Domain: The Markov feature in the DCT and DWT domains was proposed in [6], in which the correlation of neighboring coefficients is used to differentiate authentic and spliced images. The process involves the calculation of difference arrays followed by transition probability matrices. The threshold value T introduced in [6] minimizes the computational cost and achieves a feature dimension of \((2\mathrm{T}+1) \times (2\mathrm{T}+1) \times 4\), but this is still on the higher side. To minimize the feature vector dimension and limit overfitting, we introduce an enhanced threshold method that achieves a much smaller feature dimension of \((\mathrm{T}+1) \times (\mathrm{T}+1) \times 4\). The proposed approach is explained in step 3 of this section. Markov features in the DCT domain are computed as follows:

Step 1: In the first step, the DCT coefficients are obtained by applying the non-overlapping \(8\times 8\) block discrete cosine transform (BDCT) to the input image; the resulting coefficient array is denoted as S. We use the BDCT in our model because of its energy compaction and decorrelation capability.
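The block DCT step can be sketched as follows (a minimal illustration, assuming a grayscale image given as a NumPy array and using SciPy's `dctn`; the function and variable names are ours, not from the paper):

```python
import numpy as np
from scipy.fft import dctn

def block_dct(image, bsize=8):
    """Apply an orthonormal 2-D DCT to each non-overlapping 8x8 block.
    The image is cropped so its dimensions are multiples of bsize."""
    h, w = (d - d % bsize for d in image.shape)
    S = np.empty((h, w))
    for i in range(0, h, bsize):
        for j in range(0, w, bsize):
            S[i:i + bsize, j:j + bsize] = dctn(
                image[i:i + bsize, j:j + bsize], norm='ortho')
    return S
```

With `norm='ortho'`, the (0, 0) entry of each block holds the block's (scaled) mean, which reflects the energy compaction mentioned above.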

Step 2: In the second step, round the DCT coefficients to the nearest integer value. Then, the horizontal \((F_{h})\) and vertical \((F_{v})\) difference arrays are calculated using the following equations:

$$\begin{aligned} F_{h}\left( i,j \right) =S\left( i,j \right) -S\left( i+1,j \right) \end{aligned}$$
(1)
$$\begin{aligned} F_{v}\left( i,j \right) =S\left( i,j \right) -S\left( i,j+1 \right) \end{aligned}$$
(2)

where \(i \in \left[ 1,S_{m}-1 \right]\), \(j \in \left[ 1,S_{n}-1 \right]\), and \(S_{m}\) and \(S_{n}\) are the dimensions of the input source image.
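Eqs. (1) and (2) can be sketched compactly in NumPy (our illustrative code, assuming the coefficient array is first rounded to integers; both difference arrays are cropped to a common \((S_m-1)\times(S_n-1)\) size, matching the index ranges above):

```python
import numpy as np

def difference_arrays(S):
    """Horizontal and vertical difference arrays (Eqs. 1 and 2)
    of the rounded BDCT coefficient array S."""
    S = np.round(S).astype(int)
    Fh = S[:-1, :-1] - S[1:, :-1]   # F_h(i,j) = S(i,j) - S(i+1,j)
    Fv = S[:-1, :-1] - S[:-1, 1:]   # F_v(i,j) = S(i,j) - S(i,j+1)
    return Fh, Fv
```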

Step 3: Enhanced threshold method: Given a threshold T \(\left( T \in N_{+} \right) \), an element of the difference array is replaced with T or −T if its value is \(>\)T or \({<}-\)T, respectively, and the state range we consider is \((u,v)\in \left\{ -T,-T+2, \ldots ,T-2, T \right\} \). Over this range, we calculate the horizontal and vertical Markov matrices using Eqs. (3), (4), (5), and (6), which reduces the feature dimension to \(4\times (\mathrm{T}+1) \times (\mathrm{T}+1)\).

$$\begin{aligned} \begin{aligned}&p\left\{ F _{h}\left( i+1,j \right) =v\mid F_{h}\left( i,j \right) =u \right\} =\\&\frac{\sum _{i=1}^{S_{m}-2}\sum _{j=1}^{S_{n}-1}\delta \left( \left( F _{h}\left( i,j \right) =u \parallel F_{h}\left( i,j \right) =u-1 \right) , \left( F _{h}\left( i+1,j \right) =v \parallel F_{h}\left( i+1,j \right) =v-1 \right) \right) }{\sum _{i=1}^{S_{m}-2}\sum _{j=1}^{S_{n}-1}\left( \delta \left( F_{h}\left( i,j \right) =u \right) \parallel \delta \left( F_{h}\left( i,j \right) =u-1 \right) \right) } \end{aligned} \end{aligned}$$
(3)
$$\begin{aligned} \begin{aligned}&p\left\{ F _{h}\left( i,j+1 \right) =v\mid F_{h}\left( i,j \right) =u \right\} =\\&\frac{\sum _{i=1}^{S_{m}-1}\sum _{j=1}^{S_{n}-2}\delta \left( \left( F _{h}\left( i,j \right) =u \parallel F_{h}\left( i,j \right) =u-1 \right) , \left( F _{h}\left( i,j+1 \right) =v \parallel F_{h}\left( i,j+1 \right) =v-1 \right) \right) }{\sum _{i=1}^{S_{m}-1}\sum _{j=1}^{S_{n}-2}\left( \delta \left( F_{h}\left( i,j \right) =u \right) \parallel \delta \left( F_{h}\left( i,j \right) =u-1 \right) \right) } \end{aligned} \end{aligned}$$
(4)
$$\begin{aligned} \begin{aligned}&p\left\{ F _{v}\left( i+1,j \right) =v\mid F_{v}\left( i,j \right) =u\right\} =\\&\frac{\sum _{i=1}^{S_{m}-2}\sum _{j=1}^{S_{n}-1}\delta \left( \left( F _{v}\left( i,j \right) =u \parallel F_{v}\left( i,j \right) =u-1 \right) , \left( F _{v}\left( i+1,j \right) =v \parallel F_{v}\left( i+1,j \right) =v-1 \right) \right) }{\sum _{i=1}^{S_{m}-2}\sum _{j=1}^{S_{n}-1}\left( \delta \left( F_{v}\left( i,j \right) =u \right) \parallel \delta \left( F_{v}\left( i,j \right) =u-1 \right) \right) } \end{aligned} \end{aligned}$$
(5)
$$\begin{aligned} \begin{aligned}&p\left\{ F _{v}\left( i,j+1 \right) =v\mid F_{v}\left( i,j \right) =u \right\} =\\&\frac{\sum _{i=1}^{S_{m}-1}\sum _{j=1}^{S_{n}-2}\delta \left( \left( F _{v}\left( i,j \right) =u \parallel F_{v}\left( i,j \right) =u-1 \right) , \left( F _{v}\left( i,j+1 \right) =v \parallel F_{v}\left( i,j+1 \right) =v-1 \right) \right) }{\sum _{i=1}^{S_{m}-1}\sum _{j=1}^{S_{n}-2}\left( \delta \left( F_{v}\left( i,j \right) =u \right) \parallel \delta \left( F_{v}\left( i,j \right) =u-1 \right) \right) } \end{aligned} \end{aligned}$$
(6)

where \((u, v) \in \left\{ -T,-T+2,-T+4, \ldots, T-4,T-2,T \right\} \), \(S_{m}\) and \(S_{n}\) denote the dimensions of the original source image, and

$$\begin{aligned} \delta \left( A=u,B=v \right) ={\left\{ \begin{array}{ll} 1 &{} \text { if } A=u,B=v. \\ 0 &{} \text { otherwise } \end{array}\right. } \end{aligned}$$
(7)

Finally, all the captured elements of the Markov matrix can be used as features for image splicing detection.
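The enhanced threshold and the transition matrices of Eqs. (3)–(6) can be sketched as follows. This is our interpretation of the scheme: each clipped value u or u−1 is merged into the single state u, giving T+1 states per direction; the function and variable names are illustrative, not from the paper:

```python
import numpy as np

def enhanced_markov(F, T=6, axis=0):
    """Transition probability matrix of a difference array F under the
    enhanced threshold: values are clipped to [-T, T] and each pair
    (u-1, u) of values is merged into one state, giving T+1 states."""
    D = np.clip(np.round(F).astype(int), -T, T)
    idx = (D + T + 1) // 2               # state index in [0, T]
    if axis == 0:                        # pairs (i, j) -> (i+1, j), Eq. (3)
        a, b = idx[:-1, :], idx[1:, :]
    else:                                # pairs (i, j) -> (i, j+1), Eq. (4)
        a, b = idx[:, :-1], idx[:, 1:]
    M = np.zeros((T + 1, T + 1))
    np.add.at(M, (a.ravel(), b.ravel()), 1)    # joint counts
    row = M.sum(axis=1, keepdims=True)         # occurrences of state u
    return np.divide(M, row, out=np.zeros_like(M), where=row > 0)
```

Each row of the returned matrix is a conditional distribution, so non-empty rows sum to 1; the matrix size is \((T+1)\times(T+1)\) rather than \((2T+1)\times(2T+1)\).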

Similarly, inter-block correlation is considered to extract more Markov features. Here, inter-block difference 2D array is calculated using Eqs. (8) and (9).

$$\begin{aligned} E_{h}\left( i,j \right) =S\left( i,j \right) -S\left( i+8,j \right) \end{aligned}$$
(8)
$$\begin{aligned} E_{v}\left( i,j \right) =S\left( i,j \right) -S\left( i,j+8 \right) \end{aligned}$$
(9)

where \(i \in \left[ 1,S_{m}-8 \right]\), \(j \in \left[ 1,S_{n}-8 \right]\), and \(S_{m}\) and \(S_{n}\) are the dimensions of the original input image.

Now, the enhanced threshold method is applied to the inter-block difference arrays \(E_{h}(i,j)\) and \(E_{v}(i,j)\) as explained in step 3, with \(S_{m}-1\), \(S_{m}-2\), \(S_{n}-1\), and \(S_{n}-2\) replaced by \(S_{m}-8\), \(S_{m}-16\), \(S_{n}-8\), and \(S_{n}-16\), respectively. Hence, by considering inter-block correlation, \(4\times (T+1) \times (T+1)\) more features are extracted. Thus, a total of \(2 \times 4 \times (T+1) \times (T+1)\) features are extracted from the DCT domain, which can be used to distinguish an authentic image from a spliced one.
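The inter-block difference arrays of Eqs. (8) and (9) differ from Eqs. (1) and (2) only in the offset, which is 8 (one block) instead of 1. A sketch under the same assumptions as before:

```python
import numpy as np

def interblock_difference(S):
    """Inter-block difference arrays (Eqs. 8 and 9): each coefficient
    is compared with the coefficient at the same position in the next
    8x8 block, i.e. at an offset of 8 rows or columns."""
    S = np.round(S).astype(int)
    Eh = S[:-8, :] - S[8:, :]   # E_h(i,j) = S(i,j) - S(i+8,j)
    Ev = S[:, :-8] - S[:, 8:]   # E_v(i,j) = S(i,j) - S(i,j+8)
    return Eh, Ev
```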

Feature Extraction in DMWT Domain: Most previously proposed DWT-based approaches [9, 10] deal with all the sub-bands independently after wavelet decomposition, but [6] shows that there are dependencies among wavelet components across positions, scales, and orientations. However, it is observed that among the three dependencies, the contributions of position and orientation to splicing detection are greater than that of scale. So, in this paper, we only consider the dependencies across position and orientation. Markov features with these dependencies are extracted as follows.

Step 1: We apply the two-level discrete Meyer wavelet transform to the input image, take the absolute values of the coefficients of the eight sub-bands, and round them to the nearest integer. The processed sub-bands are denoted as \(\left\{ W_{a}^{b},W_{h}^{b},W_{v}^{b},W_{d}^{b} \right\} \), where \(\text {b}=\left\{ 1,2 \right\} \).

Step 2: Consider the dependency across position in the DMWT domain, which is similar to characterizing the correlation between neighboring coefficients in the DCT domain. Hence, by replacing S in Eqs. (1) and (2) with each of the eight sub-bands of the DMWT domain and then using Eqs. (3), (4), (5), and (6), we capture a total of \((T+1)\times (T+1) \times 32\) more Markov features.

Step 3: Now, considering the dependency among orientations, more features can be extracted using the following difference arrays.

$$\begin{aligned} W_{h}W_{v}^b\left( i,j \right) =W_{h}^{b}\left( i,j \right) -W_{v}^{b}\left( i,j \right) \end{aligned}$$
(10)
$$\begin{aligned} W_{v}W_{d}^b\left( i,j \right) =W_{v}^{b}\left( i,j \right) -W_{d}^{b}\left( i,j \right) \end{aligned}$$
(11)
$$\begin{aligned} W_{d}W_{h}^b\left( i,j \right) =W_{d}^{b}\left( i,j \right) -W_{h}^{b}\left( i,j \right) \end{aligned}$$
(12)

where b = \(\left\{ 1,2 \right\} \) and \( W_{a}^{b}, W_{h}^{b}, W_{v}^{b}, W_{d}^{b}\) denote bth level approximation, horizontal, vertical, and diagonal sub-bands, respectively.

Now, \(F_{h}\) in Eqs. (3) and (4) is replaced by each of the difference arrays obtained in (10), (11), and (12) to capture more Markov matrices. Hence, \((T+1)\times (T+1)\times 12\) more Markov features are obtained.

By combining the \((T+1)\times (T+1)\times 8\) Markov features captured in the DCT domain and the \((T+1)\times (T+1)\times 44\) Markov features captured in the DMWT domain, the resultant feature vector is used to differentiate a spliced image from an authentic one. We choose the threshold T = 6, which gives a total of 2548 features.
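As a quick sanity check on the dimension count (a small illustrative computation, not from the paper):

```python
# Total feature dimension for the proposed method: 8 Markov matrices
# from the DCT domain (intra- and inter-block, horizontal and vertical)
# plus 44 from the DMWT domain (32 position + 12 orientation), each of
# size (T+1) x (T+1).
T = 6
n_matrices = 8 + 32 + 12           # DCT + DMWT position + DMWT orientation
total = n_matrices * (T + 1) ** 2  # 52 * 49
print(total)
```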

3 Experimental Results and Performance Analysis

3.1 Dataset and Classifier

We use the Columbia image dataset [11] provided by DVMM. It consists of 933 authentic and 912 spliced images without any post-processing enhancement. All the forged images are spliced images. This dataset is designed to test blind image splicing detection methods. Some images from the DVMM dataset are shown in Fig. 3, in which the first row shows a set of authentic images and the second row shows a set of spliced images.

Fig. 3
figure 3

Some sample of authentic and spliced images from Columbia image splicing evaluation dataset [11]

To classify the images, a support vector machine (SVM) is used in our experiments. The SVM classifier is trained to solve the binary decision problem of classifying authentic and spliced images.

To evaluate the performance, all the experiments are performed on the Columbia image splicing dataset [11] using the same classifier. In each experiment, 80% of the images, selected at random, are used to train the SVM classifier, and the remaining 20% are used for testing.
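The evaluation protocol above can be sketched with scikit-learn. In this illustration, synthetic well-separated features stand in for the 2548-dimensional Markov feature vectors, since the dataset itself is not reproduced here; all names are ours:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in feature matrix: in the real experiment each row would be the
# 2548-dimensional Markov feature vector of one Columbia dataset image.
rng = np.random.default_rng(42)
authentic = rng.normal(0.0, 1.0, (100, 20))
spliced = rng.normal(3.0, 1.0, (100, 20))
X = np.vstack([authentic, spliced])
y = np.array([0] * 100 + [1] * 100)   # 0 = authentic, 1 = spliced

# 80% of the images train the SVM; the remaining 20% are used for testing.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          random_state=0)
clf = SVC(kernel='rbf').fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
```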

3.2 Performance Analysis of the Proposed Model

Several experiments are carried out to verify and compare the detection accuracy of the proposed approach, with T set to 6. Feature vectors from the DCT and DMWT domains are captured, and the detection performance of the proposed method is compared with that of He et al. [6] in both domains. The results are shown in Tables 1 and 2, respectively. In Table 2, level 1 and level 2 represent the first-level and second-level DMWT, respectively. It can be observed from Tables 1 and 2 that our method improves the detection rate by approximately 1.0–3.1% and 2.1–2.3% in the BDCT and DMWT domains, respectively. Further, combining the feature vectors from the DCT and DMWT domains yields much better accuracy.

Table 1 Comparison and detection rate of original and proposed method in BDCT domain with T = 6
Table 2 Comparison and detection rate of original and proposed method in DMWT domain with T = 6

Table 3 shows the detection rates of the proposed work and some previous splicing detection methods [7, 12]. The complete implementation of the proposed method achieves an accuracy of 88.17%, which represents significant progress in splicing detection. In Table 3, the true positive (TP) and true negative (TN) rates are calculated as:

$$\begin{aligned} TP=\frac{N_{ca}}{N_{a}},\quad TN=\frac{N_{cs}}{N_{s}} \end{aligned}$$
(13)

where \(N_{ca}\) is the number of correctly classified authentic images, \(N_{cs}\) the number of correctly classified spliced images, \(N_{a}\) the total number of authentic images, and \(N_{s}\) the total number of spliced images.
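Eq. (13) in code (a trivial illustration; the counts used below are hypothetical, not results from the paper):

```python
def tp_tn(n_ca, n_cs, n_a, n_s):
    """True positive and true negative rates as defined in Eq. (13):
    the fraction of authentic (resp. spliced) images classified
    correctly."""
    return n_ca / n_a, n_cs / n_s

# Hypothetical example: 90 of 100 authentic and 85 of 100 spliced
# images classified correctly.
tp, tn = tp_tn(90, 85, 100, 100)
```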

Table 3 Comparison of proposed approach and existing methods

The experimental results of the proposed and other methods are shown in Table 3. It can be observed that our proposed method performs best among the three splicing detection schemes presented in Table 3.

3.3 Recognizing Real Images

Figure 4 shows five images collected from the Internet: three original images (b), (c), and (e) and their associated altered images (a) and (d). To test these five images (three authentic and two spliced), we trained the classifier as described in Sect. 3.1. The test was performed 20 times, and the results are shown in Table 4. It can be observed that images are wrongly classified in only four cases.

Fig. 4
figure 4

Test images. a Spliced image. b Authentic image. c Authentic image. d Altered image. e Authentic image

Table 4 Splicing detection on real images (\(\checkmark \) correct, \(\times \) wrong)

3.4 Threshold Selection

Selecting a threshold involves a trade-off: in general, for a smaller T, the information loss is higher, and the Markov matrices may be insufficient to distinguish authentic and forged images, whereas a larger T reduces information loss but the larger number of features can cause overfitting, which lowers detection performance. Therefore, the choice of T and the size of the Markov matrix have an important impact on detection performance and computational cost.

Table 5 Performance analysis for different thresholds

The performance analysis of the proposed approach for different thresholds \(\left( T = 4, 6,\,\mathrm{and}\,8\right) \) is shown in Table 5. It can be observed that T = 6 is the best choice, balancing the detection rate and computational cost with an accuracy of 88.43%.

4 Conclusion

In this paper, an enhanced threshold method is proposed to extract Markov features; it generates a reduced feature set with little information loss, which improves the detection rate. The reduced feature sets are extracted from the DCT and DMWT domains by performing a difference operation followed by the enhanced threshold method. The features extracted from the DCT domain capture the correlation between DCT coefficients, while those from the DMWT domain capture the dependencies among coefficients across orientations and positions. Finally, the combined reduced feature vector from both domains is used as the discriminative feature for classification, with an SVM as the classifier. Our experimental results are encouraging, yielding an accuracy of 88.43% correct classification, which outperforms some state-of-the-art methods.