# A Novel Approach for Target Detection and Classification Using Canonical Correlation Analysis


DOI: 10.1007/s11265-011-0625-7

- Wang, W., Adalı, T. & Emge, D. J Sign Process Syst (2012) 68: 379. doi:10.1007/s11265-011-0625-7


## Abstract

We present a novel detection approach, detection with canonical correlation (DCC), for target detection without prior information on the interference. We use the maximum canonical correlation between the target set and the observation data set as the detection statistic, and the coefficients of the canonical vector are used to determine the indices of components from a given target library, thus enabling both detection and classification of the target components that might be present in the mixture. We derive an approximate distribution of the maximum canonical correlation when targets are present. For applications where the contributions of components are non-negative, non-negativity constraints are incorporated into the canonical correlation analysis framework and a recursive algorithm is derived to obtain the solution. We demonstrate the effectiveness of DCC and its non-negative variant by applying them to the detection of surface-deposited chemical agents in Raman spectroscopy.

### Keywords

Detection, Classification, Canonical correlation analysis

## 1 Introduction

We consider the linear mixture model for a single observation,

$$ \mathbf{x}=\mathbf{T}\mathbf{a}_t+\mathbf{G}\mathbf{a}_g+\mathbf{v}, $$

where the *N* × 1 vector **x** is a single observation, **v** is a noise vector, and **a**_{t} and **a**_{g} are mixing coefficients. Here \(\mathbf{G}\equiv[\mathbf{g}_1, \mathbf{g}_2, \cdots, \mathbf{g}_{M}]\in \mathcal{R}^{N\times M}\) is a matrix of interference, or background, components, and \(\mathbf{T}\equiv[\mathbf{t}_1, \mathbf{t}_2, \cdots, \mathbf{t}_{L}]\in \mathcal{R}^{N\times L}\) is a matrix of components in the target library. In most cases *N* ≫ *M* and *L*.

When **G** is known, matched subspace detectors (MSD) [4, 6] give the generalized likelihood ratio solution [1] (Eq. 2), which involves the *oblique* projection with range space \(\langle \mathbf{G}\rangle\) and null space \(\langle \mathbf{T}\rangle\). In this paper, we use chevrons, e.g., \(\langle \mathbf{T} \rangle\), to denote the subspace spanned by the components in **T**.

Several approaches avoid the need for knowledge of **G** [7–10, 19]. In [3], least squares (LS) and non-negative least squares (NNLS) methods have been used for target detection in Raman spectroscopy (Eq. 3).

Blind source separation techniques have also been investigated as data-driven detection approaches, such as principal component analysis (PCA), non-negative matrix factorization (NMF), and independent component analysis (ICA), which have shown satisfactory performance for a number of cases (see e.g. [8, 9]). Using ICA, the decision can be made by checking correlations between the extracted components and the target. The limitation of ICA is that the mixing components have to be independent of each other; otherwise, components are split into multiple ones, decreasing the correlation with the target, if it exists in the mixture.

In this paper,^{1} we propose a data-driven detection method for target detection and classification that does not rely on interference information. Instead of using a single observation, we use multiple observations in the detection. The hypothesis testing problem can be represented as:

$$ \mathcal{H}_0: \quad \mathbf{X}=\mathbf{G}\mathbf{A}_g+\mathbf{V}, $$

$$ \mathcal{H}_1: \quad \mathbf{X}=\mathbf{T}\mathbf{A}_t+\mathbf{G}\mathbf{A}_g+\mathbf{V}, $$

where **X** ≡ [**x**_{1}, **x**_{2}, ⋯ , **x**_{B}] is an *N* × *B* observation matrix in which each **x**_{i} is a single observation, \(\mathbf{A}_g\in \mathcal{R}^{M\times B}\) is the mixing matrix of coefficients corresponding to the background components, \(\mathbf{A}_t\in \mathcal{R}^{L\times B}\) the mixing coefficient matrix of **T**, and **V** an *N* × *B* noise matrix. In most cases *N* > *M*, *L*, and *B*. Here, \(\mathcal{H}_0\) denotes the absence of the target, and \(\mathcal{H}_1\) its presence.

We use the canonical correlations [12] between the whole target library **T** and the mixtures **X** as the detection index, and use the coefficients of canonical vectors to determine which components are present in the mixtures. Hence both the detection and the classification problems can be solved with this approach. We show that, by using additional observations in canonical correlation analysis (CCA), the negative influence of unknown interference components can be mitigated in detection, thus enhancing the overall detection performance. An approximate distribution of the maximum canonical correlation in DCC is derived when targets are present and the noise follows a Gaussian distribution. The distribution is in terms of *beta* functions with parameters determined by the target library and noise variance; hence it can be used to provide guidance for choosing the decision threshold in practice.

We also develop a non-negative CCA algorithm to take into account the fact that, in some applications such as Raman spectroscopy and image processing, contributions of mixing components can only be non-negative. When this is the case, it is natural to expect a better detection performance with non-negative CCA (DNCC) by constraining all elements in the canonical vector of the target set to be non-negative.

The proposed detection methods are used in the application to Raman spectroscopy. Raman spectroscopy has been shown to be a powerful technique for non-contact and nondestructive detection and identification. It uses a laser to probe the vibrational energy levels of a molecule or crystal, thus providing information on the molecular structure and chemical composition of materials. In this paper, we consider the application of Raman spectroscopy to non-contact detection of surface-deposited chemical agents, which is particularly useful for detecting environmentally hazardous chemicals [3]. Simulation results demonstrate the effectiveness of the DCC and DNCC methods and their relationship to related methods.

## 2 Canonical Correlation Analysis

Given two data sets **X** and **Y**, canonical correlation analysis seeks a pair of canonical vectors, **a*** and **b***, that maximize the correlation *ρ* = corr(**Xa**, **Yb**), subject to the unit-variance constraints \(\mathbf{a}^T\mathbf{C}_{xx}\mathbf{a}=\mathbf{b}^T\mathbf{C}_{yy}\mathbf{b}=1\). The solution is given by the eigenvalue problems

$$ \mathbf{C}_{xx}^{-1}\mathbf{C}_{xy}\mathbf{C}_{yy}^{-1}\mathbf{C}_{yx}\,\mathbf{a}=\rho^2\,\mathbf{a}, \qquad (6) $$

$$ \mathbf{C}_{yy}^{-1}\mathbf{C}_{yx}\mathbf{C}_{xx}^{-1}\mathbf{C}_{xy}\,\mathbf{b}=\rho^2\,\mathbf{b}, \qquad (7) $$

where **C**_{xy} is the sample covariance matrix between **X** and **Y**, in which the *ij*-th element is the sample correlation between \(\mathbf{x}_i=[x_{1i}, x_{2i}, \cdots, x_{Ni}]^T\) and \(\mathbf{y}_j=[y_{1j}, y_{2j}, \cdots, y_{Nj}]^T\); **C**_{yx} is the sample covariance matrix between **Y** and **X**; and **C**_{xx} and **C**_{yy} are the covariance matrices of **X** and **Y**, respectively.

The square roots of the eigenvalues obtained from Eq. 6 or Eq. 7 are called *canonical correlations*, and the vectors **a*** and **b*** *canonical vectors*.
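For concreteness, the eigenvalue problems above can be solved numerically. The sketch below is our own illustration (the helper name `cca` is ours, not from the paper): it whitens each data set and takes the SVD of the whitened cross-covariance, whose singular values are the canonical correlations.

```python
import numpy as np

def cca(X, Y):
    """Canonical correlations between the columns of X (N x p) and Y (N x q).

    Returns the canonical correlations (descending) and the canonical
    vectors as columns of A and B. Assumes full column rank.
    """
    # Center each variable (column); correlations are defined on centered data.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Sample covariance blocks (the common 1/(N-1) scale cancels below).
    Cxx, Cyy, Cxy = Xc.T @ Xc, Yc.T @ Yc, Xc.T @ Yc

    def inv_sqrt(C):
        # Inverse symmetric square root via the eigendecomposition.
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    return np.clip(s, 0.0, 1.0), Wx @ U, Wy @ Vt.T

# Example: the first column of Y is an exact linear combination of the
# columns of X, so the leading canonical correlation is numerically one.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
Y = np.column_stack([X @ np.array([1.0, -0.5, 0.2]),
                     rng.standard_normal(200)])
rho, A, B = cca(X, Y)
print(rho)   # rho[0] is close to one
```

The whitened-SVD route is numerically equivalent to solving Eqs. 6 and 7 directly, but avoids forming the non-symmetric product matrices.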

## 3 Detection Using Canonical Correlation

In the target detection problem, we need to examine the relationship between the observation data set and the target library. Since the canonical correlation analysis provides information on the closeness of two sets of vectors, we investigate using canonical correlation analysis for the target detection problem.

The maximum canonical correlation between **X** and **T** is given by

$$ \rho=\max\limits_{\mathbf{a},\mathbf{b}}\ {\rm corr}(\mathbf{Xa},\mathbf{Tb}). \qquad (8) $$

Under \(\mathcal{H}_0\): since **X** = **GA**_{g}, when the subspaces spanned by **G** and **T** are orthogonal,

$$\rho=0.$$

Note that the orthogonality condition is just a simplification to emphasize the general idea, and not a requirement of the DCC method.

Under \(\mathcal{H}_1\): since **X** = **TA**_{t} + **GA**_{g},

$$\rho=1.$$

For example, assume **t**_{j} is the present target component, i.e., \(\mathbf{X}=\mathbf{t}_j\mathbf{a}_t^j+\mathbf{G}\mathbf{A}_g=\mathbf{SA}\), where **S** ≡ [**t**_{j} **G**] and \(\mathbf{A}\equiv\left[ \begin{array}{l} \mathbf{a}_t^j \\ \mathbf{A}_g \end{array} \right].\) An example solution is **a*** = **A**^{ − 1}[1, 0, ⋯ , 0]^{T} and \(\mathbf{b}^*=[0,\cdots,1_{(j)},\cdots,0]^T\), given that **A** is a non-singular square matrix. This is because

$$ \mathbf{X}\mathbf{a}^*=[\mathbf{t}_j\ \mathbf{G}]\mathbf{Aa}^*=[\mathbf{t}_j, \mathbf{g}_1, \cdots, \mathbf{g}_M] \left[ \begin{array}{l} 1 \\ 0\\ \vdots \\ 0 \end{array} \right]=\mathbf{t}_j,\quad{\rm and} $$

$$ \mathbf{T}\mathbf{b}^*=[\mathbf{t}_1, \cdots,\mathbf{t}_j, \cdots, \mathbf{t}_L] \left[ \begin{array}{l} 0 \\ \vdots \\ 1_{(j)} \\ \vdots \\ 0 \end{array} \right]=\mathbf{t}_j, $$

thus

$$\rho={\rm corr}(\mathbf{t}_j,\mathbf{t}_j)=1.$$

The position of the nonzero coefficient in the canonical vector **b*** indicates the index of the component that is present in the mixture **X**.

Note that **b*** might not be unique when the components in **T** are not linearly independent. However, this can be mitigated by improving the conditioning of **T** through approaches such as subspace partitioning [21].

If an interference component, i.e., a column in **G**, is linearly dependent on those in **T**, then its presence may lead to a false alarm for the dependent target components. Therefore, DCC requires that the columns of **G** and **T** not be linearly dependent, which is a reasonable assumption in most applications.

*Geometric Interpretation* We illustrate in Fig. 1 how DCC mitigates interference by using additional observations. From a geometric point of view, in subspace-based detectors, decisions are always based on the *distance* between the observations and the target subspace. A good detector yields a large distance, i.e., discrimination power, when targets are absent and a small one when targets are present. Here, we assume a noiseless environment, so observations only consist of components from \(\langle \mathbf{T} \rangle\) and \(\langle \mathbf{G} \rangle\). Let \(\mathbf{x}_0\in \langle \mathbf{G} \rangle\) be the observation when targets are absent. Then \({\mbox{\boldmath $\delta$}}_0=\mathbf{x}_0-\hat{\mathbf{x}}_0\), where \(\hat{\mathbf{x}}_0\) is the orthogonal projection of **x**_{0} onto \(\langle \mathbf{T} \rangle\), is a large distance, reaching its maximum when \(\langle \mathbf{G} \rangle\) is orthogonal to \(\langle \mathbf{T} \rangle\). Note that Fig. 1 shows the case where there is only one component in **G**.

When targets are present, in the ideal case, a good detector should yield zero distance since the noise effects are neglected. Let \(\mathbf{x}_1\in \langle \mathbf{T},\mathbf{G} \rangle\) be an observation when targets are present, as shown in Fig. 1. The minimum distance between **x**_{1} and \(\langle \mathbf{T} \rangle\) is \({\mbox{\boldmath $\delta$}}_1=\mathbf{x}_1-\hat{\mathbf{x}}_1\), where \(\hat{\mathbf{x}}_1\) is the orthogonal projection of **x**_{1} onto \(\langle \mathbf{T} \rangle\). The contribution of **G** in **x**_{1} cannot be avoided, as in Eq. 3, since \(\langle \mathbf{G} \rangle\) is unknown.

In DCC, with the addition of another observation \(\mathbf{x}_2\in \langle \mathbf{T},\mathbf{G} \rangle\), zero distance under \(\mathcal{H}_1\) can be obtained even without knowledge of \(\langle \mathbf{G} \rangle\). The canonical correlation is the *cosine* of the minimum angle between the two subspaces. When only **x**_{1} is used, *α*_{1} > 0 because of the existence of components from \(\langle \mathbf{G} \rangle\); when both **x**_{1} and **x**_{2} are used, however, the minimum angle between \(\langle \mathbf{x}_1,\mathbf{x}_2 \rangle\) and \(\langle \mathbf{T} \rangle\) is equal to zero, with the solution \(\hat{\mathbf{x}}=\hat{\mathbf{t}}= \langle \mathbf{x}_1,\mathbf{x}_2 \rangle \bigcap \langle \mathbf{T} \rangle,\) where \(\hat{\mathbf{x}}\) is a linear combination of **x**_{1} and **x**_{2}, and \(\hat{\mathbf{t}}\) is a linear combination of **t**_{i}, *i* = 1, ⋯ , *L*. Thus, the DCC detector can yield a zero *distance* without using information on \(\langle \mathbf{G} \rangle\); i.e., **x**_{2} provides an additional coordinate basis for the projection to make up for the missing information of **G** when **x**_{1} and **x**_{2} are combined to seek the maximum correlation with the target subspace. Hence, the maximum canonical correlation between **X** and **T** is a quantity that reflects the existence of components of **T**, and is invariant to the interference components of **G** in **X**. In general, without using knowledge of **G**, the DCC detector yields a large distance when targets are absent and zero distance when targets are present. Note that zero distance means a correlation value of one between the two vectors.

There are min (*B*, *L*) canonical correlations obtained by solving the eigenvalue problem in Eq. 6 or Eq. 7 when **X** and **T** are full rank. The solution may appear unique only because Fig. 1 can depict at most a three-dimensional scenario; with multiple components, the geometry lies in a higher-dimensional space and there are more intersections. When there is more than one target component in **X**, other canonical correlations besides the maximum one can be expected to take high values, and we can use the number of high canonical correlations as an indicator of the number of target components present.

The DCC solution takes a simple form when there is only one component in **T**, i.e., **T** = **t**. This is the case since the maximum correlation between **t** and \(\langle \mathbf{X} \rangle\) is equal to the correlation between **t** and its orthogonal projection onto \(\langle \mathbf{X} \rangle\), which is exactly the **X**(**X**^{T}**X**)^{ − 1}**X**^{T}**t** part in Eq. 9.
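This geometric picture can be checked numerically. Below is a toy sketch of our own (made-up sizes and mixing coefficients, not the paper's data), using the subspace-angle form of the maximum canonical correlation, i.e., the largest singular value between orthonormal bases of the two column spaces:

```python
import numpy as np

rng = np.random.default_rng(1)
N, L = 50, 4
T = rng.standard_normal((N, L))        # hypothetical target library
g = rng.standard_normal(N)             # unknown interference component

# Two noiseless observations mixing target t_1 with the interference g.
x1 = 1.0 * T[:, 0] + 0.8 * g
x2 = 0.5 * T[:, 0] + 1.3 * g
X = np.column_stack([x1, x2])

def max_canon_corr(A, B):
    # Cosine of the minimum principal angle between the column spaces,
    # computed as the largest singular value between orthonormal bases.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    return np.linalg.svd(Qa.T @ Qb, compute_uv=False)[0]

# One observation: the interference keeps the correlation well below one.
single = max_canon_corr(x1[:, None], T)
# Two observations: span(x1, x2) intersects <T> in t_1, so the maximum
# canonical correlation reaches one without any knowledge of g.
block = max_canon_corr(X, T)
print(single, block)   # single < 1, block close to 1
```

With a single observation the interference direction pulls the correlation away from one; adding the second observation restores the intersection with \(\langle \mathbf{T} \rangle\), which is the mechanism DCC exploits.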

## 4 DCC with Non-negativity Constraints

In certain applications, such as Raman spectroscopy and image processing, the contributions of mixing components can only be non-negative. When this is the case, it is natural to expect a better detection performance if we constrain all elements in the canonical vector of the target set to be non-negative:

$$ \rho=\max\limits_{\mathbf{a},\mathbf{b}}\ {\rm corr}(\mathbf{Xa},\mathbf{Yb}),\quad {\rm subject~to}~~\mathbf{b}\succeq \mathbf{0}. \qquad (10) $$

The above problem can be solved numerically using a general constrained optimization algorithm, such as the augmented Lagrange method. Since inequality constraints are usually computationally expensive to incorporate in numerical computation, a more efficient procedure is desirable for the given problem. We develop a non-negative canonical correlation algorithm based on the canonical correlation analysis on two convex polyhedral cones [17].

A convex polyhedral cone *C*(**s**) is generated by a finite number of vectors: there exist vectors **s**_{1}, **s**_{2}, ⋯ , **s**_{k} of **R**^{n} such that any element **x** of *C*(**s**) can be written as

$$ \mathbf{x}=\sum\limits_{i=1}^{k}\phi_i\mathbf{s}_i, $$

where *ϕ*_{1}, ⋯ , *ϕ*_{k} are non-negative.

An algorithm of canonical correlation analysis on two convex polyhedral cones is given in [17], where a pair of vectors \(\hat{\mathbf{x}}\) and \(\hat{\mathbf{y}}\) are computed in the cones *C*(**X**) and *C*(**Y**), respectively, such that the correlation between \(\hat{\mathbf{x}}\) and \(\hat{\mathbf{y}}\) is maximized. In the problem stated in Eq. 10, however, the non-negativity constraint is imposed only on the subspace spanned by **Y**. In other words, using NCCA for detection, we seek the solution of \(\hat{\mathbf{y}}\) in the cone *C*(**Y**) and \(\hat{\mathbf{x}}\) in an unlimited subspace spanned by **X**, denoted by *D*(**X**). Hence the NCCA algorithm is developed for a single cone for the problem given in Eq. 10.

Before introducing the NCCA algorithm, we present the following proposition [17]:

**Proposition**

*Let C be a convex polyhedral cone,* **x** *a vector in* **R**^{n}, **y** *the projection of* **x** *onto C, and* **z** *any vector of C. Then the maximum correlation* corr(**x**, **z**) *is reached at* **z** ∝ **y**, *i.e.,* **z** = *α***y** *for any positive scalar α.*

Here, *P*_{C}(·) denotes the projection operator onto the cone *C*(**Y**), and \(P_{\mathbf{D}}(\cdot)=\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\) is the orthogonal projection operator onto the subspace *D*(**X**).

We implement *P*_{C} using the non-negative least squares algorithm in [18, 19].

The NCCA algorithm is described as follows:

*Non-negative canonical correlation analysis algorithm*

- 1.
Initialize \(\hat{\mathbf{y}}_0\) with \(\hat{\mathbf{y}}_0=\frac{1}{L}\sum_{i=1}^L{\mathbf{y}_i}\),

- 2.
\(\hat{\mathbf{x}}_i=P_{\mathbf{D}}(\hat{\mathbf{y}}_{i-1})\), and normalize \(\hat{\mathbf{x}}_i\),

- 3.
\(\hat{\mathbf{y}}_i=P_{\mathbf{C}}(\hat{\mathbf{x}}_{i})\), and normalize \(\hat{\mathbf{y}}_i\),

- 4.
Stop if both \(\|\hat{\mathbf{x}}_i-\hat{\mathbf{x}}_{i-1}\|\) and \(\|\hat{\mathbf{y}}_i-\hat{\mathbf{y}}_{i-1}\|\) are less than a given tolerance; if not, go to step 2.
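The four steps above can be sketched in code. This is our own implementation with made-up names and test sizes; the projected-gradient NNLS used for the cone projection \(P_C\) is a simple stand-in for the NNLS solver of [18, 19], and the correlation reported is the uncentered cosine of the converged pair.

```python
import numpy as np

def nnls_pg(Y, x, iters=5000, tol=1e-12):
    """min_b ||Y b - x||^2 s.t. b >= 0 via projected gradient descent.

    A simple stand-in for the NNLS solver of [18, 19].
    """
    step = 1.0 / np.linalg.norm(Y.T @ Y, 2)    # safe step from the Lipschitz constant
    b = np.zeros(Y.shape[1])
    for _ in range(iters):
        b_new = np.maximum(b - step * (Y.T @ (Y @ b - x)), 0.0)
        if np.linalg.norm(b_new - b) < tol:
            return b_new
        b = b_new
    return b

def ncca(X, Y, max_iter=200, tol=1e-8):
    """Single-cone NCCA: alternate projections between the subspace D(X)
    and the cone C(Y), following steps 1-4 in the text. Assumes the cone
    projection of the iterate is nonzero."""
    # Step 1: initialize y_hat as the mean of the columns of Y (normalized here).
    y_hat = Y.mean(axis=1)
    y_hat /= np.linalg.norm(y_hat)
    P_D = X @ np.linalg.inv(X.T @ X) @ X.T     # orthogonal projector onto D(X)
    x_hat = np.zeros_like(y_hat)
    for _ in range(max_iter):
        # Step 2: project onto D(X), then normalize.
        x_new = P_D @ y_hat
        x_new /= np.linalg.norm(x_new)
        # Step 3: project onto C(Y) (non-negative combination), then normalize.
        b = nnls_pg(Y, x_new)
        y_new = Y @ b
        y_new /= np.linalg.norm(y_new)
        # Step 4: stop when both iterates have converged.
        done = (np.linalg.norm(x_new - x_hat) < tol and
                np.linalg.norm(y_new - y_hat) < tol)
        x_hat, y_hat = x_new, y_new
        if done:
            break
    return float(x_hat @ y_hat), x_hat, y_hat, b   # uncentered correlation

# Example: column 2 of a small non-negative library is mixed with an
# unknown non-negative interference g in both observations.
rng = np.random.default_rng(2)
T = np.zeros((20, 4))
for i in range(4):
    T[5 * i:5 * (i + 1), i] = 1.0              # orthogonal non-negative library
g = rng.random(20)                             # unknown interference
X = np.column_stack([2.0 * T[:, 2] + g, 0.5 * T[:, 2] + 3.0 * g])
rho, x_hat, y_hat, b = ncca(X, T)
print(rho, np.argmax(b))
```

In the example, the cone-constrained coefficient vector **b** concentrates on the target column actually present in the mixture, which is how DNCC performs classification.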

## 5 Distribution of the Maximum Canonical Correlation in DCC

In this section, we study the distribution of the maximum canonical correlation in Eq. 8 based on the linear mixture model. Even though the distribution of sample canonical correlations between two normally distributed random vectors has been well studied [13–15], in DCC, the maximum canonical correlation is between a target data set **T** and an observation data set **X** = **TA**_{t} + **GA**_{g} + **V** = **SA** + **V**, where **S** = [**T**^{s} **G**^{s}] and \(\mathbf{A}\equiv\left[ \begin{array}{l} \mathbf{A}_t^s \\ \mathbf{A}_g^s \end{array} \right].\) Here **T**^{s} is the subset of **T** that is present in the mixture, and similarly for **G**^{s}; \(\mathbf{A}_t^s\) and \(\mathbf{A}_g^s\) are their corresponding mixing coefficients. We assume that all the elements in the noise matrix **V** follow a Gaussian distribution \(\mathcal{N}\left(0,\sigma^2\right)\), and are independent and identically distributed (i.i.d.). Then **X** has the distribution of \(\mathcal{N}\left(\mathbf{SA},\sigma^2\mathbf{1}_{N\times B}\right)\), where **1**_{N × B} is an *N* × *B* matrix of all ones. The elements of **X** are not i.i.d., and **T** is a deterministic matrix instead of being a sample of a random vector. Hence, the distribution results for two random vectors in [13–15] cannot be directly applied, and we need to derive the distribution of the maximum canonical correlation given in Eq. 8 for DCC. Note that, in DCC, the decisions are based on the proximity of the detection index *ρ* to one; therefore, we are interested in the distribution of *ρ* under \(\mathcal{H}_1\).

We assume **A** is an orthonormal matrix (if it is not, the observations can be whitened so that **A** is equivalent to an orthonormal matrix); hence, \(\mathbf{V}'\triangleq \mathbf{V}\mathbf{A}^{-1}\) has the same distribution as **V**. When the target is present, say **s**_{1} = **t**_{1}, then **S** = [**t**_{1}, **s**_{2}, ⋯ , **s**_{B}], where **s**_{2}, ⋯ , **s**_{B} could be either target or interference components. As discussed in Section 2, the maximum canonical correlation *ρ* will assume a value close to one, and there exists a solution in which the contribution of **t**_{1} is dominant compared to the other mixing components. Hence, we can approximate *ρ* by the *multiple correlation* between **y** = **t**_{1} + **v**_{1}′ and **T**, i.e.,

$$ R^2=\frac{\mathbf{y}^T\mathbf{T}(\mathbf{T}^T\mathbf{T})^{-1}\mathbf{T}^T\mathbf{y}}{\mathbf{y}^T\mathbf{y}}. $$

The distribution of *R*^{2}, given that \(\mathbf{y}\sim\mathcal{N}\left(\mathbf{t}_1,\sigma^2\mathbf{I}\right)\), is a non-central F-distribution, which is derived in the Appendix (Eq. 14); there, *B*(·, ·) is the *beta* function, and *λ* is a scalar determined by **T** and *σ*^{2}, as defined in Eq. 20. Note that **T** is a deterministic matrix and **y** is a random vector centered at **t**_{1}; thus, the existing work on the distribution of the multiple correlation cannot be used in this case [16].

To verify the approximation, the empirical distribution of *R*^{2} is obtained as follows. We randomly generate a target library **T** with dimension 200 × 20, and insert a column of **T**, **t**_{1}, into a randomly generated 200 × 5 matrix **S** drawn from a Gaussian distribution, such that **S** = [**t**_{1}, **s**_{2}, ⋯ , **s**_{5}]. Then **S** is multiplied by a 5 × 5 orthonormal mixing matrix **A**, and noise is added to **X** = **SA** for different signal-to-noise ratios (SNR) with respect to the generated data **x**_{i}, *i* = 1, ⋯ , 5, where *σ*^{2} is the variance of the noise. In Fig. 3, the curve of the non-central F-distribution is drawn using Eq. 14; *ρ*^{2} is the square of the maximum canonical correlation calculated with Eq. 11, i.e., the detection index used in DCC; and *R*^{2} is the approximation of *ρ*^{2} calculated with Eq. 13. The histograms of *R*^{2} and *ρ*^{2} are obtained from 10,000 simulation runs, with different mixing matrices **A** and noise matrices **V** for each run. We can see that the derived non-central F-distribution gives a good approximation for both *R*^{2} and *ρ*^{2}. The difference between *R*^{2} and *ρ*^{2} is slight, which justifies the approximation in Eq. 12.
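A small-scale version of this Monte Carlo comparison can be sketched as follows. This is our own illustration, not the paper's exact setup: 200 runs instead of 10,000, a single 10 dB SNR point, the uncentered subspace-angle form of *ρ*^{2}, and the Eq. 14 density curve omitted.

```python
import numpy as np

rng = np.random.default_rng(3)
N, L, B = 200, 20, 5
T = rng.standard_normal((N, L))                # random target library
snr_db = 10.0

def max_canon_corr_sq(X, T):
    # Squared cosine of the minimum principal angle (uncentered form of rho^2).
    Qx, _ = np.linalg.qr(X)
    Qt, _ = np.linalg.qr(T)
    return np.linalg.svd(Qx.T @ Qt, compute_uv=False)[0] ** 2

def multiple_corr_sq(y, T):
    # Squared multiple correlation R^2 between y and the columns of T.
    P = T @ np.linalg.inv(T.T @ T) @ T.T       # orthogonal projector onto <T>
    return float(y @ P @ y / (y @ y))

rho2s, R2s = [], []
for _ in range(200):                           # the paper uses 10,000 runs
    # Mixture: t_1 plus four random components, orthonormal mixing, noise.
    S = np.column_stack([T[:, 0], rng.standard_normal((N, B - 1))])
    A, _ = np.linalg.qr(rng.standard_normal((B, B)))
    Xc = S @ A
    sigma2 = np.mean(Xc**2) / 10 ** (snr_db / 10)   # noise variance for the SNR
    X = Xc + np.sqrt(sigma2) * rng.standard_normal((N, B))
    rho2s.append(max_canon_corr_sq(X, T))
    # Approximation of Eq. 12: y = t_1 + noise, R^2 between y and T.
    y = T[:, 0] + np.sqrt(sigma2) * rng.standard_normal(N)
    R2s.append(multiple_corr_sq(y, T))
print(np.mean(rho2s), np.mean(R2s))
```

Under these assumptions the empirical means of *ρ*^{2} and *R*^{2} come out close to each other, consistent with the approximation argument above.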

## 6 Simulation Results in Raman Spectroscopy

We have a library of 62 Raman spectra, \(\mathcal{S}=\left\{\mathbf{s}_1,\cdots,\mathbf{s}_{62}\right\},\) where the first 50 are spectra of target chemicals of interest, and the last 12 are spectra of possible background materials, i.e., the interference components. The frequency range of the Raman spectra we use is [500, 2800] cm^{ − 1} represented by a vector with 903 sampling points. The data were collected and measured by the Edgewood Chemical and Biological Center of US Army [20].

To evaluate the detection performance, we plot receiver operating characteristic (ROC) curves for all detectors. *P*_{FA} is the probability of false alarm, or 1−specificity, and *P*_{D} is the probability of detection, or sensitivity. The area under ROC curve measures *discrimination*, which is the ability of the test to make correct decisions. The discrimination values are given in each ROC plot. Each curve is drawn using 200 runs in this paper.

In each run, a target component **T**^{s} is chosen from \(\left\{\mathbf{s}_1,\cdots,\mathbf{s}_{50}\right\}\) and **G**^{s} from \(\left\{\mathbf{s}_{51},\cdots,\mathbf{s}_{62}\right\}\). The coefficients **a**_{g} and **a**_{t} are randomly generated from a uniform distribution in the range [0, 1] for each detection run. The noise vector **v** is generated using a zero-mean Gaussian distribution, and its variance *σ*^{2} is chosen according to the desired SNR with respect to the generated data **x**.

We investigate the performance of the DCC and DNCC detectors, along with the MSD and NNLS detectors for comparison. Both the MSD and NNLS detectors operate on a sample-by-sample basis, while DCC and DNCC operate on a block of samples. Note that the NNLS, DCC and DNCC detectors do not rely on the interference information; thus only **T** is used in their implementations. For MSD, we let **G** = **G**^{s} in Eq. 2, which provides a significant advantage for MSD over NNLS and DCC/DNCC. In practice, however, the interference is usually either unknown or impractical to estimate, and when the actual interference is missing from the MSD detector, its detection performance may collapse. Therefore, a practical implementation of MSD is to include all possible interferences. In our simulation, we implement an MSD using all the interference components in the library as the interference matrix, i.e., **G** = [**s**_{51}, ⋯ , **s**_{62}] in Eq. 2, denoted MSD-L.

Another variation of MSD is to use a block of samples, as the DCC/DNCC detectors do. It can be implemented by replacing **x**, **T** and **G** in Eq. 2 with **1**_{B} ⊗ **x**, **I**_{B} ⊗ **T** and **I**_{B} ⊗ **G**, respectively, where ⊗ is the Kronecker product operator, *B* the number of observations used, **I**_{B} an identity matrix of size *B*, and \(\mathbf{1}_B=[1,\cdots,1]^T\) a length-*B* vector of ones. This detector is denoted MSD-LBlk, where all possible interference components are included in **G**, as in the MSD-L detector.
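This block construction can be written directly with `numpy.kron`. The sketch below (made-up sizes, data layout only, no MSD statistic) mirrors the replacements described above:

```python
import numpy as np

N, L, M, B = 6, 2, 1, 3                  # made-up sizes for illustration
x = np.arange(float(N))                  # one observation vector
T = np.ones((N, L))                      # stand-in target matrix
G = np.ones((N, M))                      # stand-in interference matrix

x_blk = np.kron(np.ones(B), x)           # 1_B (x) x : B stacked copies of x
T_blk = np.kron(np.eye(B), T)            # I_B (x) T : block-diagonal target matrix
G_blk = np.kron(np.eye(B), G)            # I_B (x) G : block-diagonal interference
print(x_blk.shape, T_blk.shape, G_blk.shape)
```

The block-diagonal structure makes the stacked system *B* times larger in every dimension, which is the source of the cubic cost growth discussed below for MSD-LBlk.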

In addition, we implement a detector using single correlations (DCOR), in which the correlations between the observation and each component in the library are first calculated. The maximum correlation is then used as the detection index, and the index of the corresponding component identifies the component present in the mixture. Note that, as in DCC and DNCC, the interference information is not used in DCOR.

We also implement the LS detector in Eq. 3, which is outperformed by the NNLS detector. Thus we do not include the result in this paper.

In the first experiment, **s**_{10} is the mixing chemical and **s**_{54} is the background material. We use two observations for applying DCC and DNCC. The zero-mean Gaussian noise vector **v** is generated with a variance such that the SNR is 2 dB.

In Fig. 5a and b, we can see a significant improvement in discrimination values for the DCC and DNCC detectors over other data-driven detectors such as NNLS and DCOR. As described in Section 3, this is because the interference information embodied in the observation sample block is utilized through the canonical correlation analysis in DCC, mitigating the effect of interference in detection. In NNLS and DCOR, by contrast, detection is performed on a sample-by-sample basis, so the background interferences cannot be removed. DNCC performs better than DCC because the non-negativity constraints help eliminate the effects of interference from the rest of the target library.

It is worth noting that MSD also provides good performance in Fig. 5b, which is not surprising because it has the particular advantage of knowing the exact mixing interference component. However, when more interference components are included, the MSD-L detector performs poorly in this experiment. MSD-LBlk gives better performance than MSD-L since it uses more samples for detection. Compared to other detectors, an important issue is the dimensionality of the MSD-LBlk implementation. Let *B* be the block size; then the complexity of MSD-LBlk is *O*(*B*^{3}*n*^{3}), while the complexity of the MSD and DCC detectors is *O*(*n*^{3}). Thus, the computational cost of MSD-LBlk is significantly higher than that of the other detectors, which makes it impractical for online applications.

## 7 Conclusion

In this paper, we propose a new detection method, DCC, for data-driven target detection with unknown interferences. Using the maximum canonical correlation and the corresponding canonical vectors between the mixtures and the target library, we can determine whether any library component is present in the mixtures and identify which ones. We also develop a non-negative canonical correlation algorithm for applications where only non-negative contributions exist in the mixtures, such as the Raman spectroscopy application presented in this paper. The distribution of the maximum canonical correlation is derived when targets are present, which can provide guidance for determining the detection threshold.

In simulations, we studied the performance of DCC and DNCC in Raman spectroscopy for the detection of surface-deposited chemical agents. The results demonstrate the effectiveness of the proposed methods. The detection performance of DCC, however, degrades if the components in the target set are highly correlated. The performance of DCC and DNCC can be further improved by partitioning the whole target set into subsets with low inter- and intra-subset correlations [21], as demonstrated in [22].