1 Introduction

The intensive use of images and, more generally, of multimedia content makes them crucial for investigative purposes but, at the same time, complicates investigations from several points of view: the large amount of data to process and analyse; the assessment of the authenticity of image content; the identification of the source that generated the image; and robustness to image manipulation and modification due to transmission, storage, posting on social networks, and so on. Photo Response Non-Uniformity pattern noise (PRNU in the sequel) is a camera fingerprint that can be used in several digital forensics procedures, as it allows a correspondence to be established between a device and the images it has acquired [2, 20, 24]. This kind of information can be useful, for example, in source camera identification, source camera verification, and image integrity and manipulation assessment [10, 11, 21, 29].

Specifically, PRNU is a noise component of the image caused by sensor imperfections. Due to its specific characteristics and its ability to uniquely identify the device, this feature has attracted great interest for addressing some critical issues in the image forensics field. Unfortunately, because of its nature, i.e. a noise component hidden in the image content, its extraction is not trivial. Several methods have been proposed in the literature in recent years, and they mainly depend on the application purpose. In fact, the working scenario can often change, requiring different operative procedures to obtain the final result with the expected/required accuracy [8, 12, 14, 21, 25, 36, 37]. The most common working scenarios are:

  1. given one or more images acquired by an unknown device and a set of devices, establish which of the available devices took the images;

  2. given one or more images acquired by an unknown device and a set of devices, establish if one of the available devices took the images;

  3. given a set of images, group them according to the corresponding source.

In some “relaxed” scenarios, it could be enough to identify the brand of the source device.

Source camera identification/verification methods are based on two main phases: i) PRNU extraction and ii) PRNU classification. The literature on this topic focuses on one of the two phases, or on both, depending on the final task. In the first case, denoising procedures are applied, as PRNU is considered a noise component whose contribution can be found in a proper residual image [22, 23, 27, 34]. In the second case, classification methods, mainly based on clustering techniques, are used [17, 19, 25]. In both cases, the adopted methods are required to be robust to PRNU extraction procedures as well as to data coming from different sources (notebook, social network [8], smartphone [37], etc.). As a matter of fact, in forensic investigations very little data is often available, which makes the source camera identification problem even more delicate. The pioneering work in this field is [24], where a model for the acquired image was proposed and discussed. Using this model, the extraction algorithm is derived by applying a proper denoising filter; classification is performed using the normalized correlation as similarity metric, and a proper statistical study is conducted to estimate the device-based thresholds used in the final identification—see Fig. 1. With regard to the denoising step, several denoisers have been proposed, including methods dealing with multiplicative noise [28]; similarly, several similarity metrics have been introduced for the classification step, even though correlation-based metrics have proven effective for this purpose. That is why normalized cross-correlation (NCC) is usually employed, even though many papers show the better performance provided by peak-to-correlation energy (PCE) [16, 38]. Interesting reviews of the current state of the art can be found in [1, 2, 23, 31, 35]. As mentioned above, the main problem in PRNU extraction is the denoising method.
In fact, residual edges (i.e., structured information that is not preserved in the denoised image) in the residual image (i.e., the difference between the noisy and the denoised image) can contribute to misclassifications or can alter the classification procedure. That is why some approaches try to estimate PRNU only on flat regions, while others apply a proper weighting in the denoising process that distinguishes between edges/textures and smooth regions [33, 39, 40]. The latter approaches show a considerable improvement in PRNU extraction even though they make it computationally intensive; this is the reason why faster denoising procedures are preferable. A different strategy has been proposed in [32] and [6]. Instead of refining and improving PRNU extraction procedures, the contribution of each pixel to the similarity metric is weighted according to its probability of being corrupted by sources other than the original PRNU. As in [32], in this paper the role of edges and textures in the denoising process is taken into account without neglecting the role of the enhancement processing that is applied whenever the reference PRNU for a given device is extracted from a set of images taken by it. This paper is thus an extension and generalization of [6], where the initial idea and preliminary results were presented. Specifically, the first two working scenarios previously mentioned are considered, and two feature vectors are defined for each candidate image. Each feature vector refers to a different averaging-based method for device PRNU estimation, and it is composed of three correlation values: one computed using the whole PRNU image; one computed using only the region containing edges; and one computed using only flat regions.
Due to the random nature of noise, it is expected that in the matching case (i.e., the analysed image has been acquired by the device under study) the two feature vectors are more correlated than in the non-matching case (i.e., the analysed image has not been taken with the device under study). Therefore, this paper studies the role played by the coherence between NCC values derived from different PRNU estimates in the source identification process. The proposed method has been extensively tested on different publicly available databases. Experimental results show that it improves the basic correlation-based source identification method, reaching and often outperforming the classification results provided by selected competing state-of-the-art methods. In particular, the proposed method seems to provide:

  • fewer ambiguities in the case of images acquired by different devices of the same model;

  • more robustness to reference PRNU estimation from natural images (NI) instead of flat-field (FF) images, i.e. images whose subject is a uniform and constant background;

  • robustness to PRNU estimation from images coming from social networks.

Fig. 1

Scheme of the source identification method in [24]. \(J_{1,d_{k}}, ..., J_{M_{d_{k}},d_{k}}\) are \(M_{d_{k}}\) images acquired by the device dk, whose PRNU is \(K_{d_{k}}\). The latter is estimated by suitably combining the residual images obtained by applying a denoiser to the \(M_{d_{k}}\) images. JC is the candidate image, i.e. the one whose source has to be found; R is the estimate of the PRNU of the device that took JC, extracted from JC by means of a denoising filter. The comparison between R and \(K_{d_{k}}\) allows us to establish whether dk is the source device for JC.

The remainder of the paper is organized as follows. The next section presents the proposed method and its theoretical and practical motivations. Section 3 presents experimental results, comparative studies and discussions. The last section draws conclusions and provides guidelines for future work.

2 The proposed method

PRNU extraction must be based on precise image modeling in which the role of the noise sources is defined; on the other hand, the model must not be too complicated, in order to keep PRNU estimation feasible. The common model adopted in the literature, after some simplifications and assumptions, is the following [10, 24]

$$ J(\mathbf{x}) = I(\mathbf{x}) + I(\mathbf{x}) K(\mathbf{x}) + N(\mathbf{x}), $$
(1)

where J is the acquired image, x is the pixel location, I is the original image content, K is the PRNU noise component, and N includes other noise sources that are independent of K. Hence, K uniquely identifies the device that took the image J; it is zero-mean and pointwise independent of I. From now on, the dependence on x will be omitted for simplicity.
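Model (1) is easy to instantiate synthetically; the sketch below (image size, intensity range and noise levels are arbitrary illustrative choices, not values from the paper) generates an image J from a fixed multiplicative fingerprint K:

```python
import numpy as np

# Synthetic acquisition following model (1): J = I + I*K + N,
# with K a fixed zero-mean multiplicative fingerprint and N
# independent additive noise.
rng = np.random.default_rng(0)
shape = (64, 64)
K = 0.01 * rng.normal(size=shape)
K -= K.mean()                          # PRNU is zero-mean by assumption
I = 100 + 50 * rng.random(shape)       # scene content
N = rng.normal(0, 1, shape)            # other noise sources, independent of K
J = I + I * K + N
print(J.shape)
```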

As can be observed, if a denoising filter F is applied to J, the residual image, i.e.

$$ R = J-F(J) , $$
(2)

preserves the K component.
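With a denoiser F available, (2) is a one-liner. In the sketch below a Gaussian filter stands in for the wavelet-based Mihcak filter used later in the paper, purely as a minimal placeholder denoiser:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def residual(J, sigma=1.0):
    """Residual image R = J - F(J) as in (2).

    A Gaussian filter is used here as a placeholder for the
    wavelet-based denoiser adopted in the paper."""
    J = J.astype(np.float64)
    return J - gaussian_filter(J, sigma)

# A flat image plus synthetic noise: the residual retains mostly
# the (approximately zero-mean) noise component.
rng = np.random.default_rng(0)
J = 128.0 + rng.normal(0, 2, (64, 64))
R = residual(J)
print(abs(R.mean()) < 1.0)
```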

By considering the first two scenarios described in the introduction, when several images (especially FF images) are available for each device, the reference pattern K of the device can be estimated by combining the single estimates of K derived from each available image. Hence, whenever another image from the same device is available, a high correlation is expected between its residual R and the reference pattern K of the device. That is why the normalized correlation is used to assess the similarity between K and R, i.e.

$$ \rho(R,K) = \frac{<(R-\bar{R}),(K-\bar{K})>}{\|(R-\bar{R})\| \| (K-\bar{K})\|}, $$
(3)

where < ⋅,⋅ > denotes the inner product and \(\bar {*}\) is the mean value of ∗.
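The normalized correlation (3) can be transcribed directly; below is a toy check on a synthetic fingerprint (all sizes and amplitudes are illustrative, not taken from the paper):

```python
import numpy as np

def ncc(R, K):
    """Normalized correlation rho(R, K) as in (3)."""
    r = R.ravel() - R.mean()
    k = K.ravel() - K.mean()
    return float(np.dot(r, k) / (np.linalg.norm(r) * np.linalg.norm(k)))

rng = np.random.default_rng(1)
K = rng.normal(size=(64, 64))                 # synthetic device fingerprint
R_match = 0.3 * K + rng.normal(size=K.shape)  # residual containing (part of) K
R_other = rng.normal(size=K.shape)            # residual from another device
print(ncc(R_match, K), ncc(R_other, K))
```

As expected, the matching residual yields a markedly higher correlation than the independent one.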

More precisely, denoting by \(J_{i,d_{j}}\) the i-th image acquired by the j-th device, by \(M_{d_{j}}\) the number of images available for that device and by \(R_{i,d_{j}}\) the corresponding residual image estimated as in (2), the reference PRNU for the device dj, namely \(K_{d_{j}}\), is

$$ K_{d_{j}} \approx \frac{1}{M_{d_{j}}}\underset{i}{\sum} R_{i,d_{j}}. $$
(4)

The previous equation refers to the ideal case, i.e. when \(R_{i,d_{j}} \approx I_{i,d_{j}} K_{d_{j}} + N_{i,d_{j}}\), \(M_{d_{j}}\) is large, \(K_{d_{j}}\) and \(I_{i,d_{j}}\) are independent and \(I_{i,d_{j}}\) resembles a flat-field image [2].

In order to better suppress possible error sources, a maximum likelihood estimate of the reference PRNU for the dj-th device [2, 24] can be derived as follows

$$ K_{d_{j}, MLE} \approx \frac{{\sum}_{i=1}^{M_{d_{j}}} R_{i,d_{j}} J_{i,d_{j}}}{{\sum}_{i=1}^{M_{d_{j}}} J_{i,d_{j}}^{2}}. $$
(5)

In this case [24], the following model for the residual image is considered: \(R_{i,d_{j}} \approx J_{i,d_{j}} K_{d_{j}} + {\varTheta }_{i,d_{j}}\), where \({\varTheta }_{i,d_{j}}\) denotes noise sources that are assumed to be independent of \(K_{d_{j}}\). This kind of estimate holds even in the case of natural (non-FF) images.
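Estimator (5) also translates directly into code; the following is a sanity-check sketch on synthetic residuals obeying the model R ≈ JK + Θ, not a forensic-grade implementation:

```python
import numpy as np

def prnu_mle(residuals, images):
    """Maximum-likelihood reference PRNU estimate as in (5):
    K = sum_i(R_i * J_i) / sum_i(J_i ** 2)."""
    num = np.zeros_like(images[0], dtype=np.float64)
    den = np.zeros_like(num)
    for R, J in zip(residuals, images):
        J = J.astype(np.float64)
        num += R * J
        den += J ** 2
    return num / den

# Synthetic check: residuals built as R = J*K + noise should let
# the estimator recover K up to the noise level.
rng = np.random.default_rng(2)
K_true = 0.01 * rng.normal(size=(32, 32))
imgs = [100 + 50 * rng.random((32, 32)) for _ in range(50)]
res = [J * K_true + rng.normal(0, 0.5, J.shape) for J in imgs]
K_hat = prnu_mle(res, imgs)
print(np.corrcoef(K_hat.ravel(), K_true.ravel())[0, 1])
```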

It is worth observing that, whenever FF images are considered for the single device, \(I_{i,d_{j}}\) is almost constant, i.e. \(I_{i,d_{j}} (\mathbf {x}) = C_{i,d_{j}} , \forall \mathbf {x}\). As a result,

$$J_{i,d_{j}} = C_{i,d_{j}} + C_{i,d_{j}}K + N_{i,d_{j}} $$

and \(K_{d_{j}}\) can be estimated directly from \(J_{i,d_{j}}\) as follows

$$ K_{d_{j},FF} = \frac{\bar{J} - \bar{C}}{\bar{C}}, $$
(6)

where \(\bar{J} = \frac{1}{M_{d_{j}}} {\sum}_{i} J_{i,d_{j}} = \bar{C} + \bar{C} K + \bar{N}\), \(\bar{C}\) is the mean value of the \(C_{i,d_{j}}\) and \(\bar{N} \approx 0\).
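The flat-field estimate (6) can be sketched as follows; here the constant level is approximated by the global mean of the averaged image, which relies on K being zero-mean (an assumption of the model, made explicit in the comment):

```python
import numpy as np

def prnu_ff(ff_images):
    """Reference PRNU from flat-field images as in (6):
    K = (mean(J) - C) / C, with C the constant content level.

    C is approximated by the global mean of the averaged image,
    which is valid because K is assumed zero-mean."""
    Jbar = np.mean(np.stack([J.astype(np.float64) for J in ff_images]), axis=0)
    C = Jbar.mean()
    return (Jbar - C) / C

# Synthetic flat fields J_i = C_i*(1 + K) + N_i.
rng = np.random.default_rng(3)
K_true = 0.01 * rng.normal(size=(32, 32))
K_true -= K_true.mean()
ffs = [C * (1 + K_true) + rng.normal(0, 0.2, K_true.shape)
       for C in 100 + 20 * rng.random(50)]
K_hat = prnu_ff(ffs)
print(np.corrcoef(K_hat.ravel(), K_true.ravel())[0, 1])
```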

Better or different variants of the aforementioned estimates of the reference PRNU can be considered in order to prevent possible denoising artifacts, errors in noise source modelling and assumptions, and so on. This kind of operation is commonly referred to in the literature as the PRNU enhancement process [2]. In this paper, we focus on the basic estimates described above, as we expect a certain amount of coherence between PRNU estimates, independently of the adopted (but consistent) procedure, as will become clearer in the sequel.

Despite the variety of distortion sources, the denoising procedure represents a crucial step, as the subsequent analyses are based on the residual image. The better the model hypotheses are met, the more consistent the PRNU estimate. In particular, the residual R has a noise component and a structural component, due to the fact that part of the edges and structures are smoothed in the denoising procedure and thus leave traces in the residual image; on the other hand, some noise component remains in the denoised image, so that R may contain only part of the PRNU image—see Fig. 2. As a result, without loss of generality, we can split the residual into two components as follows

$$ R = I_{S} + K_{N}, $$
(7)

where IS is the structural part still present in R, while KN is the PRNU component in R. The better the denoiser, the smaller IS and the closer KN is to K. This requirement is crucial especially for the single residual R (PRNU image) that has to be compared with the device PRNU (reference PRNU or reference pattern) in order to establish the origin of a given image (candidate image). One way to address this issue is to use only the image regions where the aforementioned statement holds true, i.e. smooth and almost flat regions (those that do not contain edges or textures) [34]; an alternative solution is to properly weight the similarity measure adopted in the classification process according to the local density of edges or textures [32]. Unfortunately, in the latter case some settings, such as the threshold to adopt and the best weighting function, remain open questions that can influence the final result. In any case, the selection of the denoiser, as well as of the K estimation procedure, can considerably change the final classification, as many papers in the literature have demonstrated [9, 22].

Fig. 2

Example of denoising artifacts in the residual image: top) candidate image; bottom) residual image computed as in (2)

2.1 Conditioning of cross-correlation

It is worth noticing that if R denotes the residual extracted from an image whose origin (device dj) has to be assessed, then, by the arguments used in [3] for the denoising problem, (7) and the independence between IS and KN provide

$$ \rho(R,K_{d_{k}}) = \frac{<(I_{S} + K_{N})-\overline{(I_{S} + K_{N})},(K_{d_{k}}-\bar{K}_{d_{k}})>}{\sigma_{R} \sigma_{K_{d_{k}}}} = \frac{\sigma_{K_{N},K_{d_{k}}} }{\sigma_{R} \sigma_{K_{d_{k}}}} \quad \forall k, $$
(8)

where \(\sigma _{K_{N},K_{d_{k}}}\) is the covariance between KN and \(K_{d_{k}}\), while σR and \(\sigma _{K_{d_{k}}}\) are the standard deviations of R and \(K_{d_{k}}\), respectively. This equation holds both in the matching case (j = k), i.e. whenever KN is part of \(K_{d_{j}}\), and in the non-matching case (j ≠ k), i.e. whenever KN is completely independent of \(K_{d_{j}}\). This way of writing ρ is interesting, as it clearly shows the two different error sources in source camera identification:

  • the denoiser, i.e. the term KN;

  • PRNU enhancement, i.e. the term \(K_{d_{k}}\).

In addition, it allows us to make a simple but crucial observation. Without loss of generality, we consider only the numerator of (8). Since the two terms are expected to be zero-mean, it corresponds to the inner product between KN and \(K_{d_{k}}\), i.e. \(<K_{N},K_{d_{k}}>\). It is straightforward to observe that the inner product between two vectors is badly conditioned if the two vectors are orthogonal, while it is well conditioned if the two vectors are linearly dependent. In fact, denoting by p = < y,x >= yTx the inner product between the two vectors x and y, it follows that

$$\frac{|\delta p|}{|p|} \leq \frac{\|y^{T}\| \|x\|}{|y^{T} x|} \frac{\|\delta x\|}{\|x\|},$$

where δp is the absolute error in p caused by the absolute error δx in the vector x. The quantity \(\frac {\|y^{T}\| \|x\|}{|y^{T} x|}\) plays the role of the condition number for the computation of p when y is fixed, and it is exactly the inverse of the cosine of the angle between y and x. As a result, with reference to the numerator of (8), if denoising is accurate and j = k, i.e. \(K_{N} \sim K_{d_{j}}\), then the problem is well conditioned; on the contrary, if denoising is accurate but j ≠ k, the problem is badly conditioned, as KN is expected to share nothing with \(K_{d_{k}}\). This property still holds whenever the reference PRNU for a given device is estimated using different but consistent estimation strategies. As a result, independently of the way K is estimated, we expect the computation of \(\sigma _{R,K_{d_{j}}}\) (matching case) to be better conditioned and more stable than that of \(\sigma _{R,K_{d_{k}}}\) (non-matching case).
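This conditioning argument can be checked numerically on synthetic vectors (the 1/|cos| factor below is the bound derived above; vector sizes and noise levels are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10_000
y = rng.normal(size=n)                   # plays the role of K_dk

def cond(y, x):
    # ||y|| ||x|| / |<y, x>| = 1 / |cos(angle between y and x)|:
    # the amplification factor for relative errors in <y, x>
    return float(np.linalg.norm(y) * np.linalg.norm(x) / abs(np.dot(y, x)))

x_match = y + 0.1 * rng.normal(size=n)   # K_N shares content with K_dk
x_nomatch = rng.normal(size=n)           # K_N independent of K_dk
print(cond(y, x_match), cond(y, x_nomatch))
```

The matching case yields a condition number close to 1, while the near-orthogonal (no-match) case yields a much larger one.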

In addition, assuming the estimate of \(K_{d_{k}}\) is accurate enough, if j = k then \(K_{d_{j}}\) is contained in the residual image R; since R = IS + KN, we can write \(K_{d_{j}} = K_{N} + {K_{N}^{c}}\) and then

$$ \begin{array}{@{}rcl@{}} \rho(R,K_{d_{k}}) &=& \rho(R,K_{d_{j}}) = \frac{\sigma_{K_{N},K_{d_{j}}}}{\sigma_{R} \sigma_{K_{d_{j}}}} = \frac{\sigma_{K_{N}}^{2} + \sigma_{K_{N},{K_{N}^{c}}}}{\sigma_{R} \sigma_{K_{d_{j}}}} \\ &=&\frac{\sigma_{K_{N}}^{2} + \sigma_{K_{N},{K_{N}^{c}}}}{\sqrt{\sigma^{2}_{I_{S}} \sigma^{2}_{K_{d_{j}}} + (\sigma^{2}_{K_{N}} + \sigma_{K_{N},{K_{N}^{c}}} )^{2} + \sigma^{2}_{K_{N}}\sigma^{2}_{{K_{N}^{c}}} - \sigma^{2}_{K_{N},{K_{N}^{c}}} }}\\ &=&\frac{1}{\sqrt{1+\frac{\sigma^{2}_{I_{S}} \sigma^{2}_{K_{d_{j}}} + \sigma^{2}_{K_{N}}\sigma^{2}_{{K_{N}^{c}}} - \sigma^{2}_{K_{N},{K_{N}^{c}}} }{(\sigma_{K_{N}}^{2} + \sigma_{K_{N},{K_{N}^{c}}})^{2}}}}, \end{array} $$

where \(\sigma ^{2}_{K_{N}}\sigma ^{2}_{{K_{N}^{c}}} - \sigma ^{2}_{K_{N},{K_{N}^{c}}} \geq 0\) from the Cauchy-Schwarz inequality.

Hence, in the case of a perfect denoiser, \(\sigma ^{2}_{I_{S}} = 0\) and \({K_{N}^{c}}= 0\), so that ρ approaches 1. Even though this is not feasible in real situations, we expect it to be nearly true in correspondence with flat regions. In this case, since the contribution of IS should be minor (as well as that of \({K_{N}^{c}}\)), the argument of the square root in the last line of the previous equation is close to 1. On the contrary, for textured/edge regions we expect a greater contribution from IS as well as from \({K_{N}^{c}}\); the argument of the square root is then much greater than 1, so that the value of ρ decreases. As a result, in the matching case we are able to predict, in some sense, the behaviour of ρ when estimated in specific regions of the image, i.e. with or without edges or textures. On the contrary, in the non-matching case (j ≠ k) we cannot say anything more about \(\sigma _{K_{N},K_{d_{k}}}\), except that we expect values close to zero in all image regions.

2.2 The proposed source identification method

The aforementioned observations further and more formally motivate the preliminary work presented in [6]. Specifically, in the source camera identification problem it can be advantageous to exploit the fact that if an image is acquired by a given sensor, then with high probability we are able to measure this match with almost all consistent estimates of the reference pattern (Fig. 3); on the contrary, if the image comes from another device, we expect more variable and less predictable correlation values when different estimates of the reference pattern are considered. In addition, as mentioned in the previous section, in the matching case the relation between the ρ values computed on flat regions and on textured regions is expected to be almost insensitive to different but coherent estimates of the reference PRNU. As a result, it is convenient to exploit this coherence in the identification process. In particular, in this work the coherence between ρ values computed on the whole image, on flat image regions and on textured/edge regions has been analysed and adopted for source identification purposes—see Fig. 4.

Fig. 3

Left column) Two different images from Dresden database; Middle column) \(\rho (R,K_{d_{k}})\) versus device number k computed for \(K_{d_{k}}\) estimated as in (6); Right column) \(\rho (R,K_{d_{k}})\) versus device number k computed for \(K_{d_{k}}\) estimated as in (5). Even though the reference pattern is differently estimated, in both cases the maximum value is in correspondence to the device that took the image

Fig. 4

Left column) Match case j = k. \(\rho (R,K_{d_{j}})\) value evaluated for i) the whole image residual R, ii) only the flat regions, iii) only the textured regions. Two different estimates of \(R_{d_{j}}\) have been considered. Each row refers to a different candidate image. Right column) No-match case j ≠ k. \(\rho (R,K_{d_{k}})\) values for the same images when compared with the reference PRNU of a different source. Measures in the left column are more coherent than those in the right column

More formally, let \(K_{d_{k},1}\) and \(K_{d_{k},2}\) be two different estimates of \(K_{d_{k}}\) for a fixed device dk and let {Pi}i= 1,2 be the feature vectors computed with respect to the i-th estimate of the camera fingerprint \(K_{d_{k}}\). Denoting by J the candidate image, Pi is the three-component vector whose components are described below:

  1. \(\rho (R,K_{d_{k},i})\), i.e. the correlation between the candidate image residual R and the i-th estimate of the camera fingerprint \(K_{d_{k}}\);

  2. \(\rho (R_{flat},K_{d_{k},i,flat})\), i.e. the correlation between the candidate image residual R restricted to the flat regions of J and \(K_{d_{k},i}\) restricted to the same regions;

  3. \(\rho (R_{edge},K_{d_{k},i,edge})\), i.e. the correlation between the candidate image residual R restricted to the edge regions of J and \(K_{d_{k},i}\) restricted to the same regions.

Independently of the inner dependencies between the similarity metric values evaluated in the image subregions, we expect these dependencies to be better preserved in the matching case whenever the estimate of \(K_{d_{k}}\) changes slightly.

The coherence between the two feature vectors is measured by means of their inner product, i.e.

$$ \tau_{J,K_{d_{k}}} = <P_{1},P_{2}>. $$
(9)

The larger \(\tau _{J,K_{d_{k}}}\), the higher the coherence between the normalized correlations evaluated in different image regions with respect to different estimates of the camera fingerprint, and thus the higher the probability that J comes from dk—see Fig. 5.
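A compact sketch of the two feature vectors and the coherence measure (9) on synthetic data follows; the edge mask here is an arbitrary half-image split (in the method it comes from an edge detector), and all amplitudes are illustrative:

```python
import numpy as np

def ncc(a, b):
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def feature_vector(R, K_i, edge_mask):
    """P_i = [rho(whole), rho(edges only), rho(flat only)] for one
    fingerprint estimate K_i; edge_mask is a boolean edge/texture map."""
    flat = ~edge_mask
    return np.array([ncc(R, K_i),
                     ncc(R[edge_mask], K_i[edge_mask]),
                     ncc(R[flat], K_i[flat])])

def coherence(R, K1, K2, edge_mask):
    """tau = <P1, P2> as in (9): large when the three correlations
    agree across the two fingerprint estimates."""
    return float(np.dot(feature_vector(R, K1, edge_mask),
                        feature_vector(R, K2, edge_mask)))

rng = np.random.default_rng(5)
K = rng.normal(size=(64, 64))                        # true fingerprint
K1 = K + 0.2 * rng.normal(size=K.shape)              # two noisy estimates
K2 = K + 0.2 * rng.normal(size=K.shape)
mask = np.zeros((64, 64), bool)
mask[:, :32] = True                                  # arbitrary region split
R_match = 0.5 * K + rng.normal(size=K.shape)
R_other = rng.normal(size=K.shape)
print(coherence(R_match, K1, K2, mask), coherence(R_other, K1, K2, mask))
```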

Fig. 5

\(\tau _{J,K_{d_{k}}}\) computed with respect to k = 1,...,21 for 25 different images acquired by d6 (top left), d11 (top right), d13 (bottom left) and d16 (bottom right)

2.3 The algorithm

The source camera identification algorithm is summarized below.

  1. For each device dk in the database, estimate the reference pattern \(K_{d_{k}}\) using the first predefined estimation mode and let \(K_{d_{k},1}\) denote the resulting estimate;

  2. For each device dk in the database, estimate the reference pattern \(K_{d_{k}}\) using the second predefined estimation mode and let \(K_{d_{k},2}\) denote the resulting estimate;

  3. For each candidate image J,

    • apply a predefined denoising filter and estimate the residual image R as in (2);

    • apply an edge detection filter for extracting edges/textured regions; extract flat regions as edges/textures complementary regions;

    • compute \(P_{1} = [\rho (R,K_{d_{k},1}),\rho (R_{edge},K_{d_{k},1,edge}),\rho (R_{flat},K_{d_{k},1,flat})]\) and \(P_{2} = [\rho (R,K_{d_{k},2}),\rho (R_{edge},K_{d_{k},2,edge}),\rho (R_{flat},K_{d_{k},2,flat})]\);

    • compute \(\tau _{J,K_{d_{k}}}\) as in (9);

    • compare \(\tau _{J,K_{d_{k}}}\) with a predefined threshold value: if \(\tau _{J,K_{d_{k}}}\) exceeds the threshold, then the image has been acquired by the device dk with high probability.
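The steps above can be condensed into a toy end-to-end sketch. This is an illustration, not the paper's implementation: a Gaussian filter replaces the Mihcak wavelet denoiser, a simple gradient threshold replaces the Canny detector, and all parameter values are placeholders:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, binary_dilation

def ncc(a, b):
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def edge_mask(J, thresh=10.0, dilate=7):
    # crude stand-in for the Canny step: threshold the gradient
    # magnitude, then dilate the edge map (window size 7 as in Sec. 3)
    gy, gx = np.gradient(J.astype(np.float64))
    mask = np.hypot(gx, gy) > thresh
    return binary_dilation(mask, np.ones((dilate, dilate), bool))

def tau(J, K1, K2, sigma=1.0):
    # step 3: residual (2), region split, feature vectors P1, P2, and (9)
    R = J - gaussian_filter(J.astype(np.float64), sigma)
    m = edge_mask(J)
    f = ~m
    P1, P2 = (np.array([ncc(R, K), ncc(R[m], K[m]), ncc(R[f], K[f])])
              for K in (K1, K2))
    return float(np.dot(P1, P2))
```

A candidate image is then attributed to device dk when tau exceeds a predefined threshold.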

Remark

PRNU extraction requires the use of a denoising filter and, as pointed out in the previous sections, this filter must be sufficiently accurate. In this paper we do not focus on the denoising filter, and the widely used wavelet-based Mihcak filter [26], as suggested in the pioneering paper [24], has been adopted in the experimental results. As a matter of fact, as also shown in the literature, different and better-performing methods could be used, for example non-local methods [4, 5, 7, 13], but they may turn out to be too computationally expensive. The wavelet-based denoising method represents a good trade-off between accuracy and required computational effort.

2.4 Weighted coherence measure

As mentioned in the previous section, the inner product defined in (9) conveys information concerning the relation between the candidate image and a given device. This is the reason why, in this paper, the inner product has been considered as a discriminative measure in the source identification process. In order to further investigate this index, the following corrections to \(\tau _{J,K_{d_{k}}}\) have also been considered

  1. \(w(J,K_{d_{k}}) = \rho (R,K_{d_{k},1}) \tau _{J,K_{d_{k}}}\);

  2. \(v(J,K_{d_{k}}) = \rho (R_{flat},K_{d_{k},1,flat}) \tau _{J,K_{d_{k}}} \).

In the first case, the classical correlation measure (the basic algorithm in [24]) is used as a corrective term for the proposed coherence measure; in the second case, the classical correlation measure restricted to flat image regions is selected as the corrective term. In the latter case, better results are expected, since measures on flat regions should be more accurate, being less affected by denoising artifacts.
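The two corrective terms can be sketched as a small helper; the numeric vectors below are hypothetical correlation values, used only for illustration:

```python
import numpy as np

def weighted_coherence(P1, P2, mode="whole"):
    """Corrective weighting of tau = <P1, P2>:
    w = rho(whole) * tau  (mode="whole"),
    v = rho(flat)  * tau  (mode="flat"),
    with P_i = [rho(whole), rho(edges), rho(flat)]."""
    tau = float(np.dot(P1, P2))
    return float(P1[0] * tau) if mode == "whole" else float(P1[2] * tau)

# Hypothetical match-case correlation values (not measured data).
P1 = np.array([0.41, 0.12, 0.55])
P2 = np.array([0.38, 0.10, 0.52])
print(weighted_coherence(P1, P2, "whole"), weighted_coherence(P1, P2, "flat"))
```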

3 Experimental results

The proposed method has been tested on publicly available forensic databases: the Dresden Image Database [18] and the Vision database [30]. The former includes hundreds of images (natural and flat-field) captured by several camera models and devices—a subset of uncompressed images and devices, listed in Tables 1 and 2, has been used in our tests. The latter is composed of videos and images, both in native format and in their social versions (Facebook, YouTube and WhatsApp are considered), from 35 portable devices of 11 major brands. In this paper, Facebook images have been analysed, as listed in Table 3.

Table 1 Selected images and devices from Dresden database
Table 2 Selected images and devices from Dresden database
Table 3 Selected images and devices from Vision database

For the estimation of camera fingerprint, i.e. the reference PRNU \(K_{d_{k}}\), we use

$$ K_{d_{k},1} = \frac{1}{M_{d_{k}}}\sum\limits_{i=1}^{M_{d_{k}}} H_{i,d_{k}} $$
(10)

where \( H_{i,d_{k}}(\mathbf {x}) = J_{i,d_{k}}(\mathbf {x}) - C_{i,x_{2}} - C_{i,x_{1}}\), with \(C_{i,x_{2}} = \frac {1}{Nrows} {\sum }_{x_{1}} J_{i,d_{k}}(\mathbf {x}) \) and \(C_{i,x_{1}} = \frac {1}{Ncols} {\sum }_{x_{2}} (J_{i,d_{k}}(\mathbf {x}) - C_{i,x_{2}}) \), in agreement with [24]; for \(K_{d_{k},2}\), a different equation is used according to the available images, in order to be more consistent with the image model. Specifically, if FF images are available, \(K_{d_{k},2}\) is set equal to (6), as it refers to constant images. On the contrary, if only NI are available, \(K_{d_{k},2}\) is set equal to (5), as it is more robust to possible distortions introduced during the denoising step.

For edge extraction, the standard Canny edge detection algorithm has been selected with Matlab default parameters, and a dilation with window size equal to 7 has been applied to the output edge map. Also in this case, a classical edge detector has been considered for simplicity. The size of the dilation window has been empirically set as the one providing the best results on average. As a matter of fact, the dilation factor should be set according to the candidate image content. However, the estimation of the best dilation parameter would require additional computational effort and is out of the scope of this paper.
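The effect of the 7×7 dilation window on a binary edge map can be illustrated with scipy's binary_dilation (standing in here for the Matlab call used in the experiments):

```python
import numpy as np
from scipy.ndimage import binary_dilation

# A single edge pixel grows to a full 7x7 block under dilation with
# a 7x7 structuring element, widening the band excluded from the
# flat-region mask around each detected edge.
edges = np.zeros((9, 9), bool)
edges[4, 4] = True
grown = binary_dilation(edges, np.ones((7, 7), bool))
print(int(grown.sum()))  # 49
```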

For comparative studies, the pioneering reference method in [24] (basic method) and the one in [32] have been considered. The former has been selected since we aim to measure to what extent the proposed method improves the original work; in other words, we are interested in quantifying the benefit of using the proposed coherence measure as an alternative to the absolute correlation (plain mode) or as a corrective term for it (weighted mode). It is worth observing that the plain mode corresponds to the preliminary work presented in [6]. The method in [32] has been selected as it shares the same strategy, i.e. weighting the correlation measure according to image regions. However, the method in [32] defines the weight as the local density of textured/edge regions, whereas the proposed method focuses on the inner dependencies of the correlation metric between more or less textured image regions rather than on their contribution to the similarity metric. The results have been compared using standard classification indices, such as specificity, sensitivity, precision, F1-score and accuracy [15].
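The reported indices all derive from the confusion-matrix counts; a minimal helper (the counts below are made up for illustration):

```python
def indices(tp, tn, fp, fn):
    """Standard classification indices from confusion-matrix counts."""
    prec = tp / (tp + fp)
    sens = tp / (tp + fn)              # sensitivity = true positive rate
    spec = tn / (tn + fp)
    f1 = 2 * prec * sens / (prec + sens)
    acc = (tp + tn) / (tp + tn + fp + fn)
    return prec, sens, spec, f1, acc

print(indices(90, 80, 20, 10))
```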

The first test compares the proposed method, i.e. the inner product \(\tau _{J,K_{d_{k}}}\) and its weighted version \(w(J,K_{d_{k}})\), with the basic algorithm, i.e. \(\rho (R,K_{d_{k},1})\), at a fixed sensitivity value, i.e. with the same number of true positives (and hence of false negatives) for all methods. As a result, in this case a unique threshold is used for all candidate images and devices, fixed as the one yielding the predefined sensitivity value; consequently, a different threshold has been used for each method. As can be observed in Tables 4 and 5, both \(\tau _{J,K_{d_{k}}}\) and \(w(J,K_{d_{k}})\) outperform the basic \(\rho (R,K_{d_{k},1})\) in terms of a reduced number of false positives. This means that the inner product provides a reduced number of false positive assignments in the second working scenario (the image source device may not be in the available set of devices), and the weighted inner product further improves this result. It is also worth observing that the same considerations hold whether the device PRNU is estimated from FF images or from NI.

Table 4 Results in terms of number of true positives (TP), true negatives (TN), false negatives (FN), false positives (FP), precision (Prec), F1 score (F1), specificity (Spec) and accuracy (Acc) provided by the proposed method (inner product, \(\tau _{J,K_{d_{k}}}\)), its weighted version (\(w(J,K_{d_{k}})\)) and the basic algorithm in [24] (\(\rho (R,K_{d_{k},1})\)) at three fixed sensitivity values
Table 5 Results in terms of number of true positives (TP), true negatives (TN), false negatives (FN), false positives (FP), precision (Prec), F1 score (F1), specificity (Spec) and accuracy (Acc) provided by the proposed method (inner product, \(\tau _{J,K_{d_{k}}}\)), its weighted version (\(w(J,K_{d_{k}}))\) and the basic algorithm in [24] (\(\rho (R,K_{d_{k},1})\)) at three fixed sensitivity values

In order to stress this point, the ROC (sensitivity vs. 1 − specificity) curves of the three methods are depicted in Figs. 6 and 7. As can be observed, the proposed procedure considerably improves the basic method in terms of True Positive Rate (TPR = sensitivity), especially in correspondence with high specificity values, i.e. low False Positive Rate (FPR = 1 − specificity). In fact, as also outlined in [6], the use of the inner product allows us to reduce the number of false positive assignments.

Fig. 6

Reference PRNU from FF images. Left) ROC curve of the basic algorithm compared with the one of the inner product based algorithm. Right) ROC curve of the basic algorithm compared with the one of the weighted inner product based algorithm

Fig. 7

Reference PRNU from NI images. Left) ROC curve of the basic algorithm compared with the one of the inner product based algorithm. Right) ROC curve of the basic algorithm compared with the one of the weighted inner product based algorithm

It is worth observing that the same conclusions are reached if all candidate images are analysed and a single decision threshold is applied to each metric for source identification purposes. More precisely, two different thresholds have been selected as follows:

  • the one corresponding to the Mth highest metric value, with M equal to the total number of candidate images used in the test;

  • the one selecting the top 5% of metric values (the right tail of the metric value distribution).
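The two threshold choices above can be sketched on the pooled metric values; the code below is illustrative only (names are assumptions, not from the paper):

```python
import numpy as np

def candidate_thresholds(metric_values, n_candidates):
    """Two empirical thresholds on the pooled metric values: the M-th
    highest value, with M the number of candidate images, and the value
    cutting the top 5% of the distribution's right tail."""
    v = np.sort(np.asarray(metric_values, dtype=float))[::-1]  # decreasing rearrangement
    thr_m = v[n_candidates - 1]                                # M-th highest value
    thr_5pct = v[int(np.ceil(0.05 * len(v))) - 1]              # top-5% cut
    return thr_m, thr_5pct
```

Both rules place the threshold near the knee of the sorted-value curve, which is why they land close to the separation point discussed next.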

As Fig. 8 shows, the two thresholds are close to the optimal separation point of the monotonically decreasing rearrangement of metric values, i.e. the point that separates the distribution into two groups having different characteristics. Quantitative results are reported in Tables 6 and 7. As it can be observed, the inner product based method performs better than the basic cross-correlation one at this optimal point.

Fig. 8 Top) Sorted cross-correlation (left) and inner product (right) evaluated for all the candidate images listed in Table 1 when compared with all the devices listed in the same table. Bottom) The same plots restricted to the first 1100 values. The marker indicates the threshold value corresponding to 5% of the whole set of metric evaluations and to the first 525 values. Device PRNU has been estimated using FF

Table 6 Results in terms of number of true positives (TP), true negatives (TN), false negatives (FN), false positives (FP), precision (Prec), sensitivity (Sens), F1 score (F1), specificity (Spec) and accuracy (Acc) provided by the proposed method (inner product, \(\tau _{J,K_{d_{k}}}\)) and the basic algorithm in [24] (\(\rho (R,K_{d_{k},1})\)) at the threshold level corresponding to 5% of the distribution and the one corresponding to the number of candidate images
Table 7 Results in terms of number of true positives (TP), true negatives (TN), false negatives (FN), false positives (FP), precision (Prec), sensitivity (Sens), F1 score (F1), specificity (Spec) and accuracy (Acc) provided by the proposed method (inner product, \(\tau _{J,K_{d_{k}}}\) ) and the basic algorithm in [24] (\(\rho (R,K_{d_{k},1})\)) at the threshold level corresponding to 5% of the distribution and the one corresponding to the number of candidate images

The previous test shows that the inner product conveys information concerning the image source, while the global weighting procedure improves the classification results provided by the basic algorithm. To further confirm this, the value \(\rho (R_{flat},K_{d_{k},1,flat})\) has been considered as weighting coefficient for the inner product, as it is expected to be more accurate than the global correlation value. The result is also compared with the method in [32], where a pointwise weighted correlation is employed for classification purposes. The methods are compared in Table 8, where fixed specificity values have been considered.
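A minimal sketch of the two ingredients discussed here, assuming the weight simply scales the inner product value by the flat-region correlation (the exact combination rule used in the paper may differ, and all names below are illustrative):

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between a residual and a fingerprint
    (or between their restrictions to the flat regions)."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def weighted_metric(tau, rho_flat):
    """Global weighting: the inner product value `tau` for a candidate/device
    pair is scaled by the correlation `rho_flat` computed on flat regions only.
    This combination rule is an assumption for illustration."""
    return rho_flat * tau
```

The idea is that a reliable weight (here, the flat-region correlation) reinforces pairs where both metrics agree and damps spurious high inner product values, reducing false assignments.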

Table 8 Comparisons in terms of sensitivity for fixed specificity values: 0.990, 0.994 and 0.995

As it can be observed, the weighting method always guarantees an improvement, and the more reliable the weight, the greater the improvement. It is also worth observing that the proposed global weighting procedure reaches results comparable to those of the local weighting procedure in [32]. However, the main contribution of the proposed inner product consists in better addressing the separation problem for devices for which the correlation measure is noisier, such as the Olympus devices in the Dresden database. As Fig. 9 shows, there is no threshold that allows each candidate image to be assigned to the analysed device, so many false assignments can occur. However, the inner product turns out to be less noisy than the cross-correlation. By repeating the previous comparative test device per device, with results reported in Table 9, it is evident that the inner product greatly improves on the basic algorithm.

Fig. 9 1st row) \(\rho (R,K_{d_{k},1}), \quad k=1,...,21\) computed for 25 candidate images acquired by Olympus Device 1 (x-indices 1–25), Olympus Device 2 (x-indices 26–50) and Olympus Device 3 (x-indices 51–75). 2nd row) \(\tau _{J,K_{d_{k},1}}, \quad k=1,...,21\) computed for the same candidate images. 3rd row) Same values as in the first row but restricted to the first 5 devices of the same Olympus model. 4th row) Same values as in the second row but restricted to the first 5 devices of the same Olympus model

Table 9 First three devices for Olympus. Comparisons in terms of sensitivity for fixed specificity values: 0.990, 0.994 and 0.998

The same considerations hold if NI are used for device PRNU estimation instead of FF, as shown in Tables 10 and 11, confirming the robustness of the proposed method to less precise estimations of the source camera PRNU. In order to stress this fact, Table 10 also contains the values of \(\rho (R_{flat},K_{d_{k},1,flat})\), i.e. the normalized correlation restricted to the flat regions of the candidate image. As it can be observed, in this case the correction provided by the inner product increases the discrimination power of the metric, making it more robust to the different error sources, especially at low false positive rates.

Table 10 Comparisons in terms of sensitivity for fixed specificity values: 0.990, 0.994, 0.995, 0.998, 0.999 and 0.9995
Table 11 First three devices for Olympus

The proposed procedure also proved robust to the analysis of candidate images that have been downloaded from a social network, as shown in Table 12. In this case, the proposed index improves on the basic one whenever it is used as a corrective term for the basic measure.

Table 12 Vision database

With regard to the first scenario, i.e. the set of devices contains the one that took the image, the benefit of using the weighted inner product is evident. In this case, for the candidate image J the aim is to have

$$j = \arg \max _{k} f(R,K_{d_{k}})$$

with \(d_{j}\) the device that took the image, R the residual image associated with J and f the adopted source identification metric, i.e. \(\rho (R,K_{d_{k}})\), \(\tau _{J,K_{d_{k}}}\), \(w(J,K_{d_{k}})\) and \(v(J,K_{d_{k}})\) respectively for the basic method [24], the proposed inner product and its two weighted versions. As Tables 13 and 14 show, the four metrics provide comparable results and reach a 100% acceptance rate for most of the candidate images in the database. However, it is worth observing that the basic and inner product metrics benefit from the weighting operations in resolving some ambiguities and instabilities for critical brands such as Olympus and Panasonic. In particular, the weighting operation increases the number of correct assignments.
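The assignment rule above amounts to a row-wise argmax over a matrix of metric values; a minimal sketch, assuming metric_matrix[j, k] holds \(f(R,K_{d_{k}})\) for image j and device k (an illustrative layout, not from the paper):

```python
import numpy as np

def identify_source(metric_matrix):
    """First working scenario: the matching device is assumed to be in the
    set, so each candidate image is assigned to the device maximizing the
    chosen metric.  Rows index candidate images, columns index devices."""
    return np.argmax(np.asarray(metric_matrix, dtype=float), axis=1)
```

Because only the relative ordering within each row matters here, any metric that ranks the true device first gives the same assignment; the weighting helps precisely when two devices produce nearly tied metric values.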

Table 13 Global source identification results (left) for the basic method, the inner product based method, the weighted inner product with the basic method and the weighted inner product restricted to flat regions, in the second working scenario and for the brands listed in the rightmost columns
Table 14 Confusion matrix for brand classification in the second working scenario for the basic method, the inner product based method, the weighted inner product with the basic method and the weighted inner product restricted to flat regions

Finally, with regard to the computational effort, the proposed method inherits the properties of the basic algorithm in [24]. As a result, the most expensive procedure is the denoising process; the remaining operations, i.e. the enhancement methods for reference PRNU estimation and the selective correlation values, are inexpensive, real-time and user independent.

4 Conclusion

In this paper the source camera identification problem has been addressed. Even though denoising and enhancement procedures play a crucial role in the whole identification process, the role of the metric used for source assessment is not negligible, especially in classification procedures. This paper focused on this task. Specifically, the coherence between metric values computed in different but specific regions of the image has been considered and its dependence on camera fingerprint estimation has been studied. The main result is the use of the inner product as a measure of this coherence and the observation that this quantity is better conditioned whenever the analysed candidate image has been acquired by the reference device. The proposed coherence measure has been used both as an absolute metric for the source identification process and as a corrective term for basic existing methods. Experimental results show that, even in its preliminary version, the proposed coherence contributes to improving the identification process, especially by decreasing the number of false assignments. In addition, it shows some robustness to PRNU estimation from natural images and to candidate images coming from social networks. Future research will further investigate this coherence, with particular reference to its dependence on each single component of the whole identification process, such as the denoising method, PRNU estimation mode, similarity metric and image region extraction. Finally, a more intensive study concerning its dependence on image manipulation will also be one of the topics of future work.