1 Introduction

The constrained adaptive filter (CAF) possesses an error-correction property that prevents the accumulation of quantization errors in digital implementations [4, 11, 12]. Benefiting from this property, the CAF has received much attention in recent years [1, 17, 26], and various constrained adaptive filtering algorithms (CAFAs) have been developed for applications such as linear-phase system identification, spectral analysis, interference cancellation in direct-sequence code-division multiple access (DS-CDMA), and beamforming [7, 13, 23]. A CAFA translates the application-specific requirements into a set of linear constraint equations that the coefficients must satisfy and incorporates them into the solution [8, 11]. Equivalently, the coefficient updates are performed in a subspace orthogonal to the subspace spanned by a constraint matrix. A CAFA can thus improve the robustness of the solution or obviate a training phase. In general, these constraints are deterministic and derived from a priori knowledge of the system; examples include spreading codes in blind multiuser detection and linear phase in system identification [20, 25, 27].

Existing CAFAs can be broadly classified into two main categories. The first comprises stochastic gradient (SG) based algorithms [8, 24, 29], such as the constrained least mean square (CLMS) algorithm [7] and the constrained affine projection algorithm (CAPA) [9, 10]. The second comprises least squares (LS) based algorithms [5, 6], such as the constrained recursive least squares (CRLS) algorithm and the constrained conjugate gradient (CCG) algorithm [3]. The former class is usually simpler, more robust, more computationally efficient, and easier to implement than the latter; the research in this paper therefore focuses on the first class of CAFAs. Among the SG-based constrained algorithms, CLMS is the simplest and least computationally intensive, but its convergence speed is closely tied to the eigenvalue spread (or spectral dynamic range) of the autocorrelation matrix of the input signal. When the input signals are correlated, the eigenvalue spread is high and the convergence of the CLMS algorithm slows down. Popular CAFAs that significantly accelerate the convergence of CLMS are the CAPA and its variants. The CAPA improves convergence speed through a data-reuse strategy, but the direct matrix inversion (DMI) operation in its update formula leads to high computational complexity, which is unfavorable for practical applications. For this reason, a constrained affine projection like (CAPL) algorithm was proposed [9], which eliminates the DMI operation, and with it much of the computational cost, by dropping the requirement that the a posteriori error vector be zero.

Another class of SG-based algorithms suited to colored input signals is the normalized subband adaptive filter (NSAF) algorithm [15, 18, 19]. It reduces the spectral dynamic range of the input signal by decomposing it into subband domains, thereby achieving fast convergence with colored inputs [14, 21, 28], and its computational complexity is much lower than that of AP algorithms. However, to the best of our knowledge, the NSAF has not yet been used to solve the constrained filtering problem under colored inputs. In view of this, in this paper we propose the constrained normalized subband adaptive filtering algorithm and carry out the following work:

  1. Using a subband filter with the multiband structure to whiten the colored inputs, a novel constrained adaptive algorithm based on subband signals, called the constrained NSAF (CNSAF), is derived via the Lagrange multiplier method; it converges fast with colored inputs and has low computational complexity.

  2. The statistical behavior of the proposed CNSAF algorithm, including mean and mean-square stability and transient and steady-state MSD performance, is analyzed, and transient and steady-state MSD prediction models for the CNSAF algorithm are derived.

  3. To efficiently identify sparse systems, an L1-norm constraint on the filter weight vector is introduced into the constrained optimization problem solved by the CNSAF algorithm, yielding a sparse version of the algorithm (S-CNSAF).

  4. Through simulation experiments on system identification, the accuracy of the theoretical MSD analysis results and the superiority of the proposed CNSAF algorithms over existing constrained adaptive algorithms are verified.

The remainder of this paper is organized as follows. In Sect. 2, the proposed CNSAF algorithm is derived. The mean and mean-square stability and the theoretical transient and steady-state MSD performance of the CNSAF algorithm are analyzed in Sect. 3. Section 4 presents a sparse version of the CNSAF algorithm for sparse systems. In Sect. 5, computer simulations on system identification are provided. Section 6 draws conclusions.

2 The Proposed CNSAF Algorithm

Consider a linear phase system identification application, where the desired signal of the CAF is the output of an unknown linear system excited by some broadband signal, denoted as

$$ d(k) = {\varvec{u}}^{T} (k){\mathbf{w}}_{0} + \upsilon (k) $$
(1)

where \({\mathbf{w}}_{0}\) is the system vector of length L to be estimated, \(\upsilon (k)\) represents the system noise with zero mean and variance \(\sigma_{\upsilon }^{2}\), k is the discrete time index, \({\varvec{u}}(k) = [u(k), \ldots ,u(k - L + 1)]^{T}\) is the input signal vector containing the current and past input samples, and \([ \cdot ]^{T}\) denotes transposition. In many cases the input is a white noise signal, but colored inputs also occur frequently [9, 22], such as the first-order auto-regressive (AR(1)) signal and the second-order auto-regressive (AR(2)) signal.
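For concreteness, the following minimal sketch simulates the data model (1); the filter length, sample count, noise level, and random seed are illustrative placeholders rather than the experimental settings of Sect. 5.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 7                       # filter length (illustrative)
n_samples = 5000
sigma_v = np.sqrt(0.001)    # system-noise standard deviation

w0 = rng.standard_normal(L)
w0 /= np.linalg.norm(w0)    # unit-energy system vector

u = rng.standard_normal(n_samples)   # white broadband excitation
d = np.zeros(n_samples)
for k in range(L - 1, n_samples):
    u_k = u[k - L + 1:k + 1][::-1]   # u(k) = [u(k), ..., u(k-L+1)]^T
    d[k] = u_k @ w0 + sigma_v * rng.standard_normal()   # Eq. (1)
```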

To identify the unknown linear phase system \({\mathbf{w}}_{0}\), the filter is constrained by (2) to preserve the linear phase at each iteration [10].

$$ {\varvec{H}}^{T} {\mathbf{w}} = {\varvec{m}} $$
(2)

where \({\varvec{H}}\) is an \(L \times M\) \((0 \le M \le L)\) constraint matrix, \({\mathbf{w}}\) is the filter weight vector, and \({\varvec{m}}\) is the response vector of length M.

To accelerate the convergence of the CLMS algorithm, the correlation of the input signal must be reduced, i.e., the spectral dynamic range (or eigenvalue spread) of the input signal must be reduced. An approach to reduce the spectral dynamic range of the signal is to decompose the signal into subband domains. Therefore, we propose a new constrained optimization problem based on the multi-band structured NSAF method shown in Fig. 1 as follows

$$ \min_{{\mathbf{w}}} E\left\{ {\sum\limits_{i = 1}^{N} {\frac{{\left( {d_{i,D} (t) - {\varvec{u}}_{i}^{T} (t){\mathbf{w}}} \right)^{2} }}{{2\left\| {{\varvec{u}}_{i} (t)} \right\|^{2} }}} } \right\}\quad {\text{s.t.}}\quad {\varvec{H}}^{T} {\mathbf{w}} = {\varvec{m}} $$
(3)

where \(\left\| \cdot \right\|\) denotes the L2-norm. Figure 1 illustrates the principle of the NSAF method. First, the original signals \(u(k)\) and \(d(k)\) are passed through the analysis filter bank \(\left\{ {H_{i} (z),i = 1, \ldots ,N} \right\}\) to generate the N subband signal pairs \(\{ u_{i} (k),d_{i} (k)\}\). Then, the subband desired signal \(d_{i} (k)\) is strictly decimated to obtain \(d_{i,D} (t)\), and feeding \(u_{i} (k)\) to the filter \({\mathbf{w}}(t)\) yields \(y_{i} (k) = {\varvec{u}}_{i}^{T} (k){\mathbf{w}}(t)\), where \({\varvec{u}}_{i} (k) = [u_{i} (k), \ldots ,u_{i} (k - L + 1)]^{T}\). Further, the decimated subband output signal of the filter is given by \(y_{i,D} (t) = y_{i} (tN) = {\varvec{u}}_{i}^{T} (t){\mathbf{w}}(t)\), where \({\varvec{u}}_{i} (t) = [u_{i} (tN), \ldots ,u_{i} (tN - L + 1)]^{T}\) is the decimated subband input signal vector. The indexes t and k stand for the decimated subband signal sequence and the original signal sequence, respectively.

Fig. 1 Schematic diagram of the NSAF algorithm with multiband structure
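As a rough illustration of the analysis-and-decimation stage in Fig. 1, the sketch below builds a simple cosine-modulated filter bank from a windowed lowpass prototype and critically decimates each subband. The prototype design and the omitted modulation phase terms are simplifying assumptions for illustration only; the experiments in Sect. 5 use a properly designed cosine-modulated bank.

```python
import numpy as np
from scipy.signal import firwin, lfilter

def analysis_bank(x, N=4, Lp=32):
    """Split x into N critically decimated subband signals (illustrative bank)."""
    h0 = firwin(Lp, 1.0 / N)        # lowpass prototype (placeholder design)
    n = np.arange(Lp)
    subbands = []
    for i in range(N):
        # cosine modulation shifts the prototype to the i-th band
        # (phase terms of a full CMFB design are omitted for brevity)
        hi = h0 * np.cos((2 * i + 1) * np.pi / (2 * N) * (n - (Lp - 1) / 2))
        xi = lfilter(hi, 1.0, x)    # full-rate subband signal u_i(k)
        subbands.append(xi[::N])    # critical decimation: u_i(tN)
    return subbands
```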

Based on the above defined quantities, the subband estimation error is obtained as

$$ e_{i,D}^{{}} (t) = d_{i,D}^{{}} (t) - y_{i,D}^{{}} (t) = d_{i,D}^{{}} (t) - {\varvec{u}}_{i}^{T} (t){\mathbf{w}}(t) $$
(4)

Using the Lagrange multiplier method, (3) becomes

$$ \min J({\mathbf{w}}(t)) = E\left\{ {\sum\limits_{i = 1}^{N} {\frac{{e_{i,D}^{2} (t)}}{{2\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} } \right\} + {\varvec{\gamma}}^{T} \left( {{\varvec{m}} - {\varvec{H}}^{T} {\mathbf{w}}(t)} \right) $$
(5)

where \({\varvec{\gamma}}\) is the Lagrange multiplier vector. Taking the gradient of \(J({\mathbf{w}}(t))\) with respect to \({\mathbf{w}}(t)\) yields

$$ \nabla J({\mathbf{w}}(t)) = - E\left\{ {\sum\limits_{i = 1}^{N} {\frac{{e_{i,D}^{{}} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} {\varvec{u}}_{i}^{{}} (t)} \right\} - \user2{H\gamma } $$
(6)

Utilizing the stochastic gradient descent (SGD) method, the weight vector of the CNSAF algorithm is updated by

$$ {\mathbf{w}}(t + 1) = {\mathbf{w}}(t) + \mu \sum\limits_{i = 1}^{N} {\frac{{e_{i,D}^{{}} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} {\varvec{u}}_{i}^{{}} (t) + \mu \user2{H\gamma } $$
(7)

Substituting (7) into the constraint condition in (3), the Lagrange multiplier vector \({\varvec{\gamma}}\) is derived as

$$ {\varvec{\gamma}} = \frac{1}{\mu }\left( {{\varvec{H}}^{T} {\varvec{H}}} \right)^{ - 1} {\varvec{m}} - \frac{1}{\mu }\left( {{\varvec{H}}^{T} {\varvec{H}}} \right)^{ - 1} {\varvec{H}}^{T} {\mathbf{w}}(t) - \left( {{\varvec{H}}^{T} {\varvec{H}}} \right)^{ - 1} {\varvec{H}}^{T} \sum\limits_{i = 1}^{N} {\frac{{e_{i,D}^{{}} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} {\varvec{u}}_{i}^{{}} (t) $$
(8)

Substituting (8) into (7), the weight update of the CNSAF algorithm is described as

$$ {\mathbf{w}}(t + 1) = {\varvec{C}}\left[ {{\mathbf{w}}(t) + \mu \sum\limits_{i = 1}^{N} {\frac{{e_{i,D} (t)}}{{\left\| {{\varvec{u}}_{i} (t)} \right\|^{2} }}} {\varvec{u}}_{i} (t)} \right] + {\varvec{f}} $$
(9)

where

$$ {\varvec{C}} = {\varvec{I}} - {\varvec{H}}({\varvec{H}}^{T} {\varvec{H}})^{ - 1} {\varvec{H}}^{T} $$
(10a)
$$ {\varvec{f}} = {\varvec{H}}({\varvec{H}}^{T} {\varvec{H}})^{ - 1} {\varvec{m}} $$
(10b)
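To summarize the recursion, the following sketch implements one CNSAF iteration per Eqs. (4), (9), and (10). The function and variable names are illustrative, and the small regularizer `eps` guarding the norm in the denominator is an added numerical safeguard not present in the derivation.

```python
import numpy as np

def make_projection(H, m):
    """C = I - H (H^T H)^{-1} H^T and f = H (H^T H)^{-1} m, Eq. (10)."""
    G = np.linalg.inv(H.T @ H)
    C = np.eye(H.shape[0]) - H @ G @ H.T
    f = H @ G @ m
    return C, f

def cnsaf_update(w, C, f, U, d_D, mu=0.1, eps=1e-8):
    """One CNSAF iteration, Eq. (9).

    U   : N x L matrix whose i-th row is the decimated subband regressor u_i(t)
    d_D : length-N vector of decimated subband desired samples d_{i,D}(t)
    """
    e = d_D - U @ w                       # subband errors, Eq. (4)
    norms = np.sum(U * U, axis=1) + eps   # ||u_i(t)||^2
    grad = U.T @ (e / norms)              # sum_i e_{i,D}(t) u_i(t) / ||u_i(t)||^2
    return C @ (w + mu * grad) + f
```

After every such update, `H.T @ w` equals `m` up to rounding, which reflects the error-correction property of constrained filters noted in the introduction.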

3 Performance Analysis

3.1 Optimal Solution

Letting the gradient vector \(\nabla J({\mathbf{w}}(t))\) in (6) be equal to zero, we have

$$ E\left\{ {\sum\limits_{i = 1}^{N} {\frac{{d_{i,D}^{{}} (t) - {\varvec{u}}_{i}^{T} (t){\mathbf{w}}_{opt} }}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} \cdot {\varvec{u}}_{i}^{{}} (t)} \right\} + \user2{H\gamma } = 0 $$
(11)

After some simple calculations, the optimal solution of the constrained problem (3) is given by

$$ {\mathbf{w}}_{opt} = {\varvec{R}}_{{\varvec{u}}}^{ - 1} {\varvec{p}}_{{\varvec{u}}} + {\varvec{R}}_{{\varvec{u}}}^{ - 1} \user2{H\gamma } $$
(12)

where \({\varvec{p}}_{{\varvec{u}}} = E\left\{ {\sum\limits_{i = 1}^{N} {\frac{{{\varvec{u}}_{i}^{{}} (t)d_{i,D}^{{}} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} } \right\}\) and \({\varvec{R}}_{{\varvec{u}}} = E\left\{ {\sum\limits_{i = 1}^{N} {\frac{{{\varvec{u}}_{i}^{{}} (t){\varvec{u}}_{i}^{T} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} } \right\}\). Substituting (12) into the constraint equation \({\varvec{H}}^{T} {\mathbf{w}}_{opt} = {\varvec{m}}\) leads to

$$ {\varvec{\gamma}} = \left[ {{\varvec{H}}^{T} {\varvec{R}}_{{\varvec{u}}}^{ - 1} {\varvec{H}}} \right]^{ - 1} \cdot \left[ {{\varvec{m}} - {\varvec{H}}^{T} {\varvec{R}}_{{\varvec{u}}}^{ - 1} {\varvec{p}}_{{\varvec{u}}} } \right] $$
(13)

Combining (12) and (13), the optimal solution can be further expressed as

$$ {\mathbf{w}}_{opt} = {\varvec{R}}_{{\varvec{u}}}^{ - 1} {\varvec{p}}_{{\varvec{u}}} + {\varvec{R}}_{{\varvec{u}}}^{ - 1} {\varvec{H}}\left[ {{\varvec{H}}^{T} {\varvec{R}}_{{\varvec{u}}}^{ - 1} {\varvec{H}}} \right]^{ - 1} \cdot \left[ {{\varvec{m}} - {\varvec{H}}^{T} {\varvec{R}}_{{\varvec{u}}}^{ - 1} {\varvec{p}}_{{\varvec{u}}} } \right] $$
(14)
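In practice \({\varvec{R}}_{{\varvec{u}}}\) and \({\varvec{p}}_{{\varvec{u}}}\) are unknown and must be estimated. The sketch below, an illustration using sample averages over recorded subband data, evaluates (12)–(14) directly:

```python
import numpy as np

def constrained_wiener(U_list, d_list, H, m):
    """Evaluate w_opt of Eq. (14) from sample estimates of R_u and p_u.

    U_list[t] : N x L matrix of subband regressors at decimated time t
    d_list[t] : length-N vector of decimated subband desired samples
    """
    L = U_list[0].shape[1]
    R = np.zeros((L, L))
    p = np.zeros(L)
    for U, d in zip(U_list, d_list):
        norms = np.sum(U * U, axis=1)
        R += (U / norms[:, None]).T @ U   # sum_i u_i u_i^T / ||u_i||^2
        p += (U / norms[:, None]).T @ d   # sum_i u_i d_{i,D} / ||u_i||^2
    R /= len(U_list)
    p /= len(U_list)
    Ri = np.linalg.inv(R)
    gamma = np.linalg.solve(H.T @ Ri @ H, m - H.T @ Ri @ p)   # Eq. (13)
    return Ri @ p + Ri @ H @ gamma                            # Eq. (12)/(14)
```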

3.2 Mean Stability

To facilitate the analysis, the following two assumptions are adopted.

Assumption 1:

The input signal vector \({\varvec{u}}_{i}^{{}} (t)\) is zero-mean with positive-definite covariance matrix \(E\left\{ {{\varvec{u}}_{i}^{{}} (t){\varvec{u}}_{i}^{T} (t)} \right\}\).

Assumption 2:

The input signal vector \({\varvec{u}}_{i}^{{}} (t)\) and the weight error vector \({\tilde{\mathbf{w}}}(t)\) are independent of each other, and the system noise \(\upsilon (k)\) is statistically independent of any other signals.

Defining the weight error vector \({\tilde{\mathbf{w}}}(t)\) as

$$ {\tilde{\mathbf{w}}}(t) = {\mathbf{w}}(t) - {\mathbf{w}}_{opt} $$
(15)

Subtracting \({\mathbf{w}}_{opt}\) from (9) yields

$$ {\tilde{\mathbf{w}}}(t + 1) = {\varvec{C}}\left[ {{\mathbf{w}}(t) + \mu \sum\limits_{i = 1}^{N} {\frac{{e_{i,D} (t)}}{{\left\| {{\varvec{u}}_{i} (t)} \right\|^{2} }}} {\varvec{u}}_{i} (t)} \right] + {\varvec{f}} - {\mathbf{w}}_{opt} $$
(16)

Since \({\varvec{C}}{\mathbf{w}}_{opt} - {\mathbf{w}}_{opt} + {\varvec{f}} = {\mathbf{0}}\) [23, 25], (16) becomes

$$ {\tilde{\mathbf{w}}}(t + 1) = {\varvec{C}}\left[ {{\tilde{\mathbf{w}}}(t) + \mu \sum\limits_{i = 1}^{N} {\frac{{e_{i,D}^{{}} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} {\varvec{u}}_{i}^{{}} (t)} \right] $$
(17)

To facilitate the analysis, a new weight error vector is defined as

$$ \Delta {\mathbf{w}} = {\mathbf{w}}_{0} - {\mathbf{w}}_{opt} $$
(18)

According to (15) and (18), the subband estimation error of (4) can be rewritten as

$$ e_{i,D}^{{}} (t) = {\varvec{u}}_{i}^{T} (t)[\Delta {\mathbf{w}} - {\tilde{\mathbf{w}}}(t)] + \upsilon_{i,D} (t) $$
(19)

where \(\upsilon_{i,D} (t)\) represents the i-th decimated subband system output noise.

Inserting (19) into (17) yields

$$ {\tilde{\mathbf{w}}}(t + 1) = {\varvec{C}}\left[ {{\mathbf{I}} - \mu \sum\limits_{i = 1}^{N} {\frac{{{\varvec{u}}_{i}^{{}} (t){\varvec{u}}_{i}^{T} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} } \right]{\tilde{\mathbf{w}}}(t) + \mu {\varvec{C}}\sum\limits_{i = 1}^{N} {\frac{{{\varvec{u}}_{i}^{{}} (t){\varvec{u}}_{i}^{T} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} \Delta {\mathbf{w}} + \mu {\varvec{C}}\sum\limits_{i = 1}^{N} {\frac{{\upsilon_{i,D} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} {\varvec{u}}_{i}^{{}} (t) $$
(20)

Since \({\tilde{\mathbf{w}}}(t)\) is itself of the form \({\varvec{C}}[ \cdot ]\) by (17) and \({\varvec{C}}^{2} = {\varvec{C}}\) [25], we have \({\varvec{C}}{\tilde{\mathbf{w}}}(t) = {\tilde{\mathbf{w}}}(t)\) in (20). Taking expectations on both sides of (20) results in

$$ E\{ {\tilde{\mathbf{w}}}(t + 1)\} = \left[ {{\mathbf{I}} - \mu \sum\limits_{i = 1}^{N} {{\varvec{A}}_{i}^{{}} } } \right]E\{ {\tilde{\mathbf{w}}}(t)\} + \mu \sum\limits_{i = 1}^{N} {{\varvec{A}}_{i}^{{}} } \Delta {\mathbf{w}} $$
(21)

where \({\varvec{A}}_{i} = {\varvec{C}}E\{ {\varvec{B}}_{i} (t)\} = {\varvec{CR}}_{i}\), and \({\varvec{B}}_{i} (t) = {\varvec{u}}_{i} (t){\varvec{u}}_{i}^{T} (t)/\left\| {{\varvec{u}}_{i} (t)} \right\|^{2}\).

Let \(\lambda_{j} ({\varvec{A}}_{i} )\) denote the j-th eigenvalue of \({\varvec{A}}_{i}\). Then the mean stability of the CNSAF algorithm is guaranteed if

$$ \left| {1 - \mu \sum\limits_{i = 1}^{N} {\lambda_{j} {(}{\varvec{A}}_{i}^{{}} {)}} } \right| < 1,{\text{ for}}\,j = 1,...,L $$
(22)

holds, which is equivalent to

$$ 0 < \mu < \frac{2}{{\sum\limits_{i = 1}^{N} {\lambda_{\max } {(}{\varvec{A}}_{i} )} }} $$
(23)

where \(\lambda_{\max } {(}{\varvec{A}}_{i} )\) denotes the maximum eigenvalue of \({\varvec{A}}_{i}\).
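Given sample estimates of the per-subband correlation matrices \({\varvec{R}}_{i}\), the bound (23) can be evaluated numerically, as in this short sketch. The eigenvalues of \({\varvec{A}}_{i} = {\varvec{CR}}_{i}\) are real and nonnegative, since the nonzero ones coincide with those of the symmetric matrix \({\varvec{CR}}_{i} {\varvec{C}}\); the real-part call is only a numerical guard.

```python
import numpy as np

def mean_stability_bound(C, R_list):
    """Upper step-size bound of Eq. (23), with A_i = C R_i."""
    total = sum(np.max(np.real(np.linalg.eigvals(C @ R))) for R in R_list)
    return 2.0 / total
```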

When the CNSAF algorithm arrives at steady-state, i.e., \(\mathop {\lim }\limits_{t \to \infty } E\left\{ {{\tilde{\mathbf{w}}}(t)} \right\} \approx E\left\{ {{\tilde{\mathbf{w}}}(t + 1)} \right\}\), we can obtain from (21) that

$$ E\{ {\tilde{\mathbf{w}}}(\infty )\} = \Delta {\mathbf{w}} $$
(24)

In other words, the CNSAF algorithm is asymptotically unbiased.

3.3 Mean Square Stability

Using the relation \({\varvec{C}}{\tilde{\mathbf{w}}}(t) = {\tilde{\mathbf{w}}}(t)\), we can rewrite (20) as

$$ {\tilde{\mathbf{w}}}(t + 1) = \left[ {{\mathbf{I}} - \mu \sum\limits_{i = 1}^{N} {{\varvec{CB}}_{i}^{{}} (t)} {\varvec{C}}} \right]{\tilde{\mathbf{w}}}(t) + \mu {\varvec{C}}\left( {\sum\limits_{i = 1}^{N} {{\varvec{B}}_{i}^{{}} (t)} \Delta {\mathbf{w}} + \sum\limits_{i = 1}^{N} {\upsilon_{i,D} (t){\mathbf{q}}_{i}^{{}} (t)} } \right) $$
(25)

where \({\mathbf{q}}_{i} (t) = {\varvec{u}}_{i} (t)/\left\| {{\varvec{u}}_{i} (t)} \right\|^{2}\). Taking expectations of the squared Euclidean norm of (25) under Assumptions 1 and 2 results in

$$ \begin{gathered} E\left\{ {\left\| {{\tilde{\mathbf{w}}}(t + 1)} \right\|_{{}}^{2} } \right\} = E\left\{ {\left\| {{\tilde{\mathbf{w}}}(t)} \right\|_{{\varvec{\varPi}}}^{2} } \right\} + \mu^{2} \Delta {\mathbf{w}}^{T} E\left\{ {\left( {\sum\limits_{i = 1}^{N} {{\varvec{B}}_{i}^{{}} (t)} } \right)^{T} {\varvec{C}}^{T} {\varvec{C}}\sum\limits_{i = 1}^{N} {{\varvec{B}}_{i}^{{}} (t)} } \right\}\Delta {\mathbf{w}} \\ + \mu^{2} E\left\{ {\left( {\sum\limits_{i = 1}^{N} {\upsilon_{i,D} (t)} {\varvec{q}}_{i}^{{}} (t)} \right)^{T} {\varvec{C}}^{T} {\varvec{C}}\sum\limits_{i = 1}^{N} {\upsilon_{i,D} (t)} {\varvec{q}}_{i}^{{}} (t)} \right\} \\ \end{gathered} $$
(26)

where

$${\varvec{\varPi}}= {\mathbf{I}} - 2\mu \sum\limits_{i = 1}^{N} {{\varvec{CR}}_{i}^{{}} } {\varvec{C}} + \mu^{2} {\varvec{C}}\left[ {\sum\limits_{i,j = 1}^{N} {\underbrace {{E\left\{ {\frac{{{\varvec{u}}_{i}^{{}} (t){\varvec{u}}_{i}^{T} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}{\varvec{C}}\frac{{{\varvec{u}}_{j}^{{}} (t){\varvec{u}}_{j}^{T} (t)}}{{\left\| {{\varvec{u}}_{j}^{{}} (t)} \right\|^{2} }}} \right\}}}_{(a)}} } \right]{\varvec{C}} $$
(27)

Assuming that the subband input signals are orthogonal at zero lag [18], the term (a) in (27) can be further rewritten as

$$ E\left\{ {\frac{{{\varvec{u}}_{i}^{{}} (t){\varvec{u}}_{i}^{T} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}{\varvec{C}}\frac{{{\varvec{u}}_{j}^{{}} (t){\varvec{u}}_{j}^{T} (t)}}{{\left\| {{\varvec{u}}_{j}^{{}} (t)} \right\|^{2} }}} \right\}\user2{ = }\left\{ {\begin{array}{*{20}c} {{\mathbf{0}},} & {i \ne j} \\ {E\left\{ {\frac{{{\varvec{u}}_{i}^{{}} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}{\varvec{u}}_{i}^{T} (t){\varvec{Cu}}_{i}^{{}} (t)\frac{{{\varvec{u}}_{i}^{T} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} \right\},} & {i = j} \\ \end{array} } \right. $$
(28)

Utilizing the Isserlis’ theorem [16], one achieves

$$ E\left\{ {\frac{{{\varvec{u}}_{i}^{{}} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}{\varvec{u}}_{i}^{T} (t){\varvec{Cu}}_{i}^{{}} (t)\frac{{{\varvec{u}}_{i}^{T} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} \right\} = 2{\varvec{R}}_{i}^{{}} {\varvec{CR}}_{i}^{{}} + {\varvec{R}}_{i}^{{}} tr\left[ {{\varvec{CR}}_{i}^{{}} {\varvec{C}}} \right] $$
(29)

where \(tr[ \cdot ]\) denotes the matrix trace operation. Then, (27) can be written more concisely as

$${\varvec{\varPi}}= {\mathbf{I}} - 2\mu \sum\limits_{i = 1}^{N} {{\varvec{Q}}_{i}^{{}} } + \mu^{2} \left[ {\sum\limits_{i = 1}^{N} {\left( {2{\varvec{Q}}_{i}^{2} + {\varvec{Q}}_{i}^{{}} tr[{\varvec{Q}}_{i}^{{}} ]} \right)} } \right] $$
(30)

where \({\varvec{Q}}_{i}^{{}} \user2{ = CR}_{i}^{{}} {\varvec{C}}\). Let \(\lambda_{m} ({\varvec{Q}}_{i}^{{}} ),m = 1,2, \ldots ,L\) denote the eigenvalues of \({\varvec{Q}}_{i}^{{}}\). Then the eigenvalues of \({\varvec{\varPi}}\) can be obtained as

$$ \kappa_{m} = 1 - 2\mu \sum\limits_{i = 1}^{N} {\lambda_{m} ({\varvec{Q}}_{i}^{{}} )} + \mu^{2} \left[ {\sum\limits_{i = 1}^{N} {\left( {2\lambda_{m}^{2} ({\varvec{Q}}_{i}^{{}} ) + \lambda_{m} ({\varvec{Q}}_{i}^{{}} )tr[{\varvec{Q}}_{i}^{{}} ]} \right)} } \right] $$
(31)

According to (30), the inequality

$$ \min \left\{ {\kappa_{m} } \right\}{\mathbf{I}} <{\varvec{\varPi}}< \max \left\{ {\kappa_{m} } \right\}{\mathbf{I}} $$
(32)

is established. Let \(\lambda_{a} ({\varvec{Q}}_{i} )\) and \(\lambda_{b} ({\varvec{Q}}_{i} )\) represent the eigenvalues of \({\varvec{Q}}_{i}\) corresponding to \(\min \left\{ {\kappa_{m} } \right\}\) and \(\max \left\{ {\kappa_{m} } \right\}\), respectively, i.e.,

$$ \min \left\{ {\kappa_{m} } \right\} = 1 - 2\mu \sum\limits_{i = 1}^{N} {\lambda_{a} ({\varvec{Q}}_{i}^{{}} )} + \mu^{2} \left[ {\sum\limits_{i = 1}^{N} {\left( {2\lambda_{a}^{2} ({\varvec{Q}}_{i}^{{}} ) + \lambda_{a} ({\varvec{Q}}_{i}^{{}} )tr[{\varvec{Q}}_{i}^{{}} ]} \right)} } \right] $$
(33a)
$$ \max \left\{ {\kappa_{m} } \right\} = 1 - 2\mu \sum\limits_{i = 1}^{N} {\lambda_{b} ({\varvec{Q}}_{i}^{{}} )} + \mu^{2} \left[ {\sum\limits_{i = 1}^{N} {\left( {2\lambda_{b}^{2} ({\varvec{Q}}_{i}^{{}} ) + \lambda_{b} ({\varvec{Q}}_{i}^{{}} )tr[{\varvec{Q}}_{i}^{{}} ]} \right)} } \right] $$
(33b)

Based on the above analysis, the mean-square stability condition of the CNSAF algorithm can be obtained as

$$ \left| {\max \left\{ {\kappa_{m} } \right\}} \right| < 1 $$
(34)

that is,

$$ \left| {1 - 2\mu \sum\limits_{i = 1}^{N} {\lambda_{b} ({\varvec{Q}}_{i}^{{}} )} + \mu^{2} \left[ {\sum\limits_{i = 1}^{N} {\left( {2\lambda_{b}^{2} ({\varvec{Q}}_{i}^{{}} ) + \lambda_{b} ({\varvec{Q}}_{i}^{{}} )tr[{\varvec{Q}}_{i}^{{}} ]} \right)} } \right]} \right| < 1 $$
(35)

Finally, the mean-square stability condition for the CNSAF algorithm regarding the step size is given by

$$ 0 < \mu < \frac{{2\sum\limits_{i = 1}^{N} {\lambda_{b} ({\varvec{Q}}_{i}^{{}} )} }}{{\sum\limits_{i = 1}^{N} {\left[ {2\lambda_{b}^{2} ({\varvec{Q}}_{i}^{{}} ) + \lambda_{b} ({\varvec{Q}}_{i}^{{}} )tr[{\varvec{Q}}_{i}^{{}} ]} \right]} }} $$
(36)
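The bound (36) can likewise be checked numerically. Since the eigenvalue index that yields \(\max \{ \kappa_{m} \}\) depends on the step size, the sketch below conservatively evaluates the right-hand side of (36) for every index m and keeps the smallest result; this scanning strategy is an implementation choice, not part of the analysis above.

```python
import numpy as np

def mean_square_bound(C, R_list):
    """Conservative step-size bound of Eq. (36), scanned over index m."""
    Q_list = [C @ R @ C for R in R_list]
    traces = [np.trace(Q) for Q in Q_list]
    # eigenvalues sorted consistently for every subband (index m in Eq. (31))
    lam = np.array([np.linalg.eigvalsh(Q) for Q in Q_list])
    bounds = []
    for m in range(lam.shape[1]):
        num = 2.0 * lam[:, m].sum()
        den = sum(2 * lam[i, m] ** 2 + lam[i, m] * traces[i]
                  for i in range(len(Q_list)))
        if den > 0:
            bounds.append(num / den)
    return min(bounds)   # the binding (smallest) admissible upper bound
```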

3.4 Transient and Steady-State MSD Analysis

Define the auto-correlation matrix \({\tilde{\mathbf{W}}}(t)\) of the weight error vector \({\tilde{\mathbf{w}}}(t)\) as

$$ {\tilde{\mathbf{W}}}(t) = E[{\tilde{\mathbf{w}}}(t){\tilde{\mathbf{w}}}^{T} (t)] $$
(37)

According to (20) and (37), we have

$$ \begin{gathered} {\tilde{\mathbf{W}}}(t + 1) = E[{\tilde{\mathbf{w}}}(t + 1){\tilde{\mathbf{w}}}^{T} (t + 1)] \\ = {\varvec{C}}E\left\{ {\left[ {{\mathbf{I}} - \mu \sum\limits_{i = 1}^{N} {{\varvec{B}}_{i}^{{}} (t)} } \right]{\tilde{\mathbf{W}}}(t)\left[ {{\mathbf{I}} - \mu \sum\limits_{i = 1}^{N} {{\varvec{B}}_{i}^{{}} (t)} } \right]^{T} {\varvec{C}}^{T} } \right\} + {\varvec{C}}E\left\{ {\left[ {{\mathbf{I}} - \mu \sum\limits_{i = 1}^{N} {{\varvec{B}}_{i}^{{}} } (t)} \right]{\tilde{\mathbf{w}}}(t)\Delta {\mathbf{w}}^{T} \mu \sum\limits_{i = 1}^{N} {{\varvec{B}}_{i}^{T} (t)} {\varvec{C}}} \right\} \\ + \mu {\varvec{C}}E\left\{ {\sum\limits_{i = 1}^{N} {{\varvec{B}}_{i}^{{}} (t)\Delta {\mathbf{w}}} {\tilde{\mathbf{w}}}^{T} (t)\left[ {{\mathbf{I}} - \mu \sum\limits_{i = 1}^{N} {{\varvec{B}}_{i}^{{}} } (t)} \right]^{T} {\varvec{C}}^{T} } \right\} + \mu^{2} {\varvec{C}}E\left\{ {\sum\limits_{i = 1}^{N} {{\varvec{B}}_{i}^{{}} (t)\Delta {\mathbf{w}}\Delta {\mathbf{w}}^{T} } \sum\limits_{i = 1}^{N} {{\varvec{B}}_{i}^{T} (t)} {\varvec{C}}} \right\} \\ + \mu^{2} {\varvec{C}}\sum\limits_{i = 1}^{N} {\sigma_{g,i}^{2} E\{ {\mathbf{q}}_{i}^{{}} (t){\mathbf{q}}_{i}^{T} (t)\} {\varvec{C}}} \\ \end{gathered} $$
(38)

Performing the vectorization on both sides of (38) and using the property \({\text{vec(}}{\varvec{X}}_{1} {\varvec{X}}_{2} {\varvec{X}}_{3} ) = ({\varvec{X}}_{3}^{{\text{T}}} \otimes {\varvec{X}}_{1} ){\text{vec(}}{\varvec{X}}_{2} {)}\), (38) becomes

$$ \begin{gathered} {\text{vec}}\left[ {{\tilde{\mathbf{W}}}(t + 1)} \right] = {\varvec{F}} \cdot {\text{vec}}\left[ {{\tilde{\mathbf{W}}}(t)} \right] \\ + \mu \left( {{\varvec{C}} \otimes {\varvec{C}}} \right)\left( {{\varvec{R}}_{{\varvec{u}}}^{{}} \otimes {\varvec{I}}} \right){\text{vec}}\left[ {E\left\{ {{\tilde{\mathbf{w}}}(t)\Delta {\mathbf{w}}^{T} } \right\}} \right] + \mu \left( {{\varvec{C}} \otimes {\varvec{C}}} \right)\left( {{\varvec{I}} \otimes {\varvec{R}}_{{\varvec{u}}}^{{}} } \right){\text{vec}}\left[ {E\left\{ {\Delta {\mathbf{w\tilde{w}}}^{T} (t)} \right\}} \right] \\ + \mu^{2} {(}{\varvec{C}} \otimes {\varvec{C}}{\text{)vec}}\left[ {E\left\{ {\sum\limits_{i = 1}^{N} {\sigma_{g,i}^{2} } {\mathbf{q}}_{i}^{{}} (t){\mathbf{q}}_{i}^{T} (t)} \right\}} \right] + \mu^{2} \left( {{\varvec{C}} \otimes {\varvec{C}}} \right)E\left\{ {{\varvec{T}}(t)} \right\} \cdot \left( {{\text{vec}}\left[ {E\left\{ {\Delta {\mathbf{w}}\Delta {\mathbf{w}}^{T} } \right\}} \right]} \right. \\ - {\text{vec}}\left[ {E\left\{ {\Delta {\mathbf{w\tilde{w}}}^{T} (t)} \right\}} \right] - \left. {{\text{vec}}\left[ {E\left\{ {{\tilde{\mathbf{w}}}(t)\Delta {\mathbf{w}}^{T} } \right\}} \right]} \right) \\ \end{gathered} $$
(39)

where \({\varvec{T}}(t) = \sum\limits_{i = 1}^{N} {{\varvec{B}}_{i}^{{}} (t)} \otimes \sum\limits_{i = 1}^{N} {{\varvec{B}}_{i}^{{}} (t)}\), and

$$ {\varvec{F}} = \left( {{\varvec{C}} \otimes {\varvec{C}}} \right) \cdot \left[ {{\mathbf{I}}_{{L^{2} }} } \right. - \mu \left( {{\mathbf{I}}_{L} \otimes {\varvec{R}}_{{\varvec{u}}}^{{}} } \right) - \mu \left( {{\varvec{R}}_{{\varvec{u}}}^{{}} \otimes {\mathbf{I}}_{L} } \right) + \left. {\mu^{2} E\left\{ {{\varvec{T}}(t)} \right\}} \right] $$
(40)

According to \(tr\left[ {{\varvec{X}}_{a} {\varvec{X}}_{b} } \right] = {\text{vec}}\left( {{\varvec{X}}_{a}^{T} } \right)^{T} {\text{vec}}\left( {{\varvec{X}}_{b} } \right)\), the transient mean-square deviation (MSD) at time t + 1 is given by

$$ {\text{MSD}}(t + 1) = {\text{vec}}\left( {{\varvec{I}}_{L} } \right)^{T} {\text{vec}}\left[ {{\tilde{\mathbf{W}}}(t + 1)} \right] $$
(41)

When the CNSAF algorithm converges to the steady state, the approximation \(\mathop {\lim }\limits_{t \to \infty } {\text{vec}}\left[ {{\tilde{\mathbf{W}}}(t)} \right] \approx {\text{vec}}\left[ {{\tilde{\mathbf{W}}}(t + 1)} \right]\) holds. Using the unbiasedness (24), the steady-state MSD of the CNSAF algorithm can be obtained from (39) as

$$ \begin{gathered} {\text{MSD}}(\infty ) = {\text{vec}}({\varvec{I}}_{L} )^{T} \left( {{\varvec{I}}_{{L^{2} }} - {\varvec{F}}} \right)^{ - 1} \\ \cdot \left[ {\left( {\mu \left( {{\varvec{CR}}_{{\varvec{u}}} \otimes {\varvec{C}}} \right) + \mu \left( {{\varvec{C}} \otimes {\varvec{CR}}_{{\varvec{u}}} } \right) - \mu^{2} \left( {{\varvec{C}} \otimes {\varvec{C}}} \right)E\left\{ {{\varvec{T}}(t)} \right\}} \right){\text{vec}}\left[ {E\left\{ {\Delta {\mathbf{w}}\Delta {\mathbf{w}}^{T} } \right\}} \right]} \right. \\ \left. { + \mu^{2} \left( {{\varvec{C}} \otimes {\varvec{C}}} \right){\text{vec}}\left[ {E\left\{ {\sum\limits_{i = 1}^{N} {\sigma_{g,i}^{2} } {\mathbf{q}}_{i} (t){\mathbf{q}}_{i}^{T} (t)} \right\}} \right]} \right] \\ \end{gathered} $$
(42)
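The transient model (39)–(41) can be iterated numerically. The sketch below makes the simplifying assumption \(\Delta {\mathbf{w}} = {\mathbf{0}}\) (i.e., \({\mathbf{w}}_{opt} = {\mathbf{w}}_{0}\), so all cross terms in (39) vanish) and estimates the required moments by sample averaging; `sigma2` stands for the subband noise variance \(\sigma_{g,i}^{2}\), here assumed equal across subbands for brevity.

```python
import numpy as np

def transient_msd(C, U_samples, sigma2, mu, n_iter, W0):
    """Iterate the Delta-w = 0 special case of Eq. (39) and return MSD(t), Eq. (41).

    U_samples : list of N x L regressor matrices used for the moment estimates
    W0        : initial weight-error correlation matrix (e.g. w0 w0^T if w(0) = 0)
    """
    L = W0.shape[0]
    Ru = np.zeros((L, L)); ET = np.zeros((L * L, L * L)); Qn = np.zeros((L, L))
    for U in U_samples:
        norms = np.sum(U * U, axis=1)
        B = sum(np.outer(U[i], U[i]) / norms[i] for i in range(U.shape[0]))
        ET += np.kron(B, B)          # sample estimate of E{T(t)}
        Ru += B                      # sample estimate of R_u
        Qn += sum(np.outer(U[i], U[i]) / norms[i] ** 2
                  for i in range(U.shape[0]))   # sum_i E{q_i q_i^T}
    Ru /= len(U_samples); ET /= len(U_samples); Qn /= len(U_samples)

    CC = np.kron(C, C)
    F = CC @ (np.eye(L * L) - mu * np.kron(np.eye(L), Ru)
              - mu * np.kron(Ru, np.eye(L)) + mu ** 2 * ET)   # Eq. (40)
    noise = mu ** 2 * CC @ (sigma2 * Qn).reshape(-1)          # driving term of (39)
    vecW = W0.reshape(-1)
    msd = []
    for _ in range(n_iter):
        vecW = F @ vecW + noise
        msd.append(vecW.reshape(L, L).trace())                # Eq. (41)
    return np.array(msd)
```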

3.5 Computational Complexity

Table 1 summarizes the computational complexities of the CLMS, CAPA, CAPL, and proposed CNSAF algorithms in terms of multiplications, additions, and DMI operations, where \(L_{d}\) is the length of the analysis filter bank and K is the projection order of the CAPA and CAPL algorithms. It can be observed from Table 1 that the CAPL algorithm requires no DMI operation. In addition, the computational complexity of the proposed CNSAF algorithm is much lower than those of the CAPA and CAPL algorithms and is close to that of the CLMS algorithm.

Table 1 Computational complexities of the CLMS, CAPA, CAPL, and CNSAF algorithms

4 The Sparse Version of the CNSAF Algorithm

Considering that the system to be estimated may be sparse, the L1 norm of the filter weight vector is introduced as an additional constraint in the constrained optimization problem (3), which yields

$$ \min_{{\mathbf{w}}} E\left\{ {\sum\limits_{i = 1}^{N} {\frac{{\left( {d_{i,D} (t) - {\varvec{u}}_{i}^{T} (t){\mathbf{w}}} \right)^{2} }}{{2\left\| {{\varvec{u}}_{i} (t)} \right\|^{2} }}} } \right\}\quad {\text{s.t.}}\quad \left\{ {\begin{array}{*{20}l} {{\varvec{H}}^{T} {\mathbf{w}} = {\varvec{m}}} \\ {\left\| {\mathbf{w}} \right\|_{1} = \eta } \\ \end{array} } \right. $$
(43)

where \(\eta\) is the L1 norm of the coefficient vector \({\mathbf{w}}\), \(\left\| {\mathbf{w}} \right\|_{1} = {\text{sign}}^{T} \left[ {\mathbf{w}} \right]{\mathbf{w}}\), and \({\text{sign}}\left[ w \right] \triangleq \frac{w}{\left| w \right|}\) is applied elementwise.

Using the Lagrange multiplier method, the constrained minimization problem (43) can be transformed into

$$ \mathop {\min }\limits_{{\mathbf{w}}} J({\mathbf{w}}) = E\left\{ {\sum\limits_{i = 1}^{N} {\frac{{\left( {d_{i,D}^{{}} (t) - {\varvec{u}}_{i}^{T} (t){\mathbf{w}}} \right)^{2} }}{{2\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} } \right\} + {\varvec{\gamma}}_{1}^{T} \left( {{\varvec{m}} - {\varvec{H}}^{T} {\mathbf{w}}} \right) + \gamma_{2}^{{}} \left( {\left\| {\mathbf{w}} \right\|_{1} - \eta } \right) $$
(44)

Calculating the stochastic gradient of the cost function \(J({\mathbf{w}})\) with respect to \({\mathbf{w}}\) yields

$$ \nabla J({\mathbf{w}}) = - E\left\{ {\sum\limits_{i = 1}^{N} {\frac{{e_{i,D}^{{}} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} {\varvec{u}}_{i}^{{}} (t)} \right\} - \user2{H\gamma }_{1} + \gamma_{2}^{{}} {\text{sign}}\left[ {\mathbf{w}} \right] $$
(45)

Using the SGD method, the weight update of the S-CNSAF algorithm is obtained as

$$ {\mathbf{w}}(t + 1) = {\mathbf{w}}(t) + \mu \sum\limits_{i = 1}^{N} {\frac{{e_{i,D} (t)}}{{\left\| {{\varvec{u}}_{i} (t)} \right\|^{2} }}} {\varvec{u}}_{i} (t) + \mu {\varvec{H\gamma }}_{1} - \mu \gamma_{2} {\text{sign}}\left[ {{\mathbf{w}}(t)} \right] $$
(46)

According to the relation \({\varvec{H}}^{T} {\mathbf{w}}(t + 1) = {\varvec{m}}\), the Lagrange multiplier vector \({\varvec{\gamma}}_{1} (t)\) is obtained by pre-multiplying (46) by \({\varvec{H}}^{T}\) as

$$ {\varvec{\gamma}}_{1} (t) = \frac{1}{\mu }({\varvec{H}}^{T} {\varvec{H}})^{ - 1} {\varvec{m}} - \frac{1}{\mu }({\varvec{H}}^{T} {\varvec{H}})^{ - 1} {\varvec{H}}^{T} {\mathbf{w}}(t) - ({\varvec{H}}^{T} {\varvec{H}})^{ - 1} {\varvec{H}}^{T} \cdot \left[ {\sum\limits_{i = 1}^{N} {\frac{{e_{i,D} (t)}}{{\left\| {{\varvec{u}}_{i} (t)} \right\|^{2} }}} {\varvec{u}}_{i} (t) - \gamma_{2} (t){\text{sign}}\left[ {{\mathbf{w}}(t)} \right]} \right] $$
(47)

Then (46) is pre-multiplied by \({\text{sign}}^{T} \left[ {{\mathbf{w}}(t)} \right]\) to get

$$ \eta = \eta (t) + \mu \left[ {{\text{sign}}^{T} \left[ {{\mathbf{w}}(t)} \right]\sum\limits_{i = 1}^{N} {\frac{{e_{i,D}^{{}} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} {\varvec{u}}_{i}^{{}} (t) + {\text{sign}}^{T} \left[ {{\mathbf{w}}(t)} \right]\user2{H\gamma }_{1} (t) - \gamma_{2} (t){\text{sign}}^{T} \left[ {{\mathbf{w}}(t)} \right]{\text{sign}}\left[ {{\mathbf{w}}(t)} \right]} \right] $$
(48)

where \(\eta = {\text{sign}}^{T} \left[ {{\mathbf{w}}(t)} \right]{\mathbf{w}}(t + 1)\) and \(\eta (t) = {\text{sign}}^{T} \left[ {{\mathbf{w}}(t)} \right]{\mathbf{w}}(t)\) [2]. After separating \(\gamma_{2} (t)\) from (48), one gets

$$ \gamma_{2} (t) = \frac{1}{L}\left[ { - \frac{1}{\mu }e_{{L_{1} }} (t)} \right. + {\text{sign}}^{T} \left[ {{\mathbf{w}}(t)} \right]\sum\limits_{i = 1}^{N} {\frac{{e_{i,D}^{{}} (t)}}{{\left\| {{\varvec{u}}_{i}^{{}} (t)} \right\|^{2} }}} {\varvec{u}}_{i}^{{}} (t)\left. { + {\text{sign}}^{T} \left[ {{\mathbf{w}}(t)} \right]\user2{H\gamma }_{1} (t)} \right] $$
(49)

where \(e_{{L_{1} }} (t) = \eta - \eta (t)\). By solving the system of equations stated by (47) and (49), the Lagrange multipliers \({\varvec{\gamma}}_{1} (t)\) and \(\gamma_{2} (t)\) can be derived [2]. The sparse version of the CNSAF algorithm is finally obtained, i.e.,

$$ {\mathbf{w}}(t + 1) = {\varvec{C}}\left[ {{\mathbf{w}}(t) + \mu \overline{\user2{C}}\sum\limits_{i = 1}^{N} {\frac{{e_{i,D} (t)}}{{\left\| {{\varvec{u}}_{i} (t)} \right\|^{2} }}} {\varvec{u}}_{i} (t)} \right] + {\varvec{f}} + {\varvec{f}}_{{L_{1} }} (t) $$
(50)

where \({\varvec{f}}_{{L_{1} }} (t) = e_{{L_{1} }} (t)\,{\varvec{C}}{\text{sign}}\left[ {{\mathbf{w}}(t)} \right]/\left\| {{\varvec{C}}{\text{sign}}\left[ {{\mathbf{w}}(t)} \right]} \right\|_{2}^{2}\), and \(\overline{\user2{C}} = \left[ {{\mathbf{I}}_{L} - {\varvec{C}}{\text{sign}}\left[ {{\mathbf{w}}(t)} \right]{\text{sign}}^{T} \left[ {{\mathbf{w}}(t)} \right]/\left\| {{\varvec{C}}{\text{sign}}\left[ {{\mathbf{w}}(t)} \right]} \right\|_{2}^{2} } \right]{\varvec{C}}\).
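A minimal sketch of one S-CNSAF iteration per (50) follows; `sign` is applied elementwise, `eta` is the presumed L1 norm of the target system, and the `eps` safeguards are additions not present in the derivation.

```python
import numpy as np

def s_cnsaf_update(w, C, f, U, d_D, eta, mu=0.1, eps=1e-8):
    """One S-CNSAF iteration, Eq. (50)."""
    s = np.sign(w)                      # sign[w(t)], elementwise
    Cs = C @ s
    Cs_n = Cs / (Cs @ Cs + eps)         # C sign[w] / ||C sign[w]||_2^2
    C_bar = (np.eye(len(w)) - np.outer(Cs_n, s)) @ C
    e = d_D - U @ w                     # subband errors, Eq. (4)
    norms = np.sum(U * U, axis=1) + eps
    grad = U.T @ (e / norms)            # sum_i e_{i,D}(t) u_i(t) / ||u_i(t)||^2
    e_L1 = eta - s @ w                  # e_{L1}(t) = eta - eta(t)
    f_L1 = e_L1 * Cs_n                  # f_{L1}(t)
    return C @ (w + mu * C_bar @ grad) + f + f_L1
```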

5 Simulation Results

In this section, the superiority of the proposed CNSAF algorithm over its competitors for colored inputs and the accuracy of the theoretical MSD analysis results are demonstrated via extensive computer simulations. The system output noise is zero-mean WGN with variance \(\sigma_{\upsilon }^{2} = 0.001\) in all simulations. A cosine-modulated analysis filter bank is used for the CNSAF algorithm. The MSD, defined as \(10\log_{10} E\left[ {\left\| {\tilde{\user2{w}}(t)} \right\|^{2} } \right]\) in dB, is used to measure performance. All simulation results are obtained by averaging over 100 independent runs.

5.1 System Identification

In this subsection, the convergence performance of the CLMS, CAPA, CAPL, and CNSAF algorithms is evaluated in the context of system identification. The system parameter vector \({\varvec{w}}_{0}\) of length L = 7 and the constraint matrix \({\varvec{H}}\) of size \(7 \times 3\) are randomly generated, where \({\varvec{w}}_{0}\) has unit energy, \({\varvec{H}}\) is full rank, and the constraint vector is given by \({\varvec{m}} = {\varvec{H}}^{T} {\varvec{w}}_{0}\). We consider two types of colored input signals. One is the AR(1) signal \(x(k) = 0.99x(k - 1) + z(k)\), denoted \(AR1(1, - 0.99)\). The other is the AR(2) signal \(x(k) = - 1.9x(k - 1) - 0.95x(k - 2) + z(k)\), denoted \(AR2(1,1.9,0.95)\), where \(z(k)\) is white Gaussian noise with unit variance. For a fair comparison, the step sizes of all algorithms are chosen so that their steady-state MSDs are the same. As shown in Fig. 2a and b, the convergence speed of the CLMS algorithm is severely degraded under colored input signals. Although the CAPL algorithm converges more slowly than the CAPA, it has lower computational complexity. From Fig. 2a, it can be seen that CNSAF converges in about 500 iterations, while the CAPA takes about 1000 iterations. In Fig. 2b, CNSAF is faster than the CAPA by about 200 iterations. The fastest convergence of the proposed CNSAF algorithm stems from the better whitening ability of the subband decomposition on colored input signals.
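For reference, the two colored excitations can be generated by filtering unit-variance white Gaussian noise through the stated AR recursions, e.g. with `scipy.signal.lfilter` (the sample length and seed below are arbitrary):

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(1)
z = rng.standard_normal(20000)        # unit-variance white Gaussian noise

# AR1(1, -0.99): x(k) = 0.99 x(k-1) + z(k)
x_ar1 = lfilter([1.0], [1.0, -0.99], z)

# AR2(1, 1.9, 0.95): x(k) = -1.9 x(k-1) - 0.95 x(k-2) + z(k)
x_ar2 = lfilter([1.0], [1.0, 1.9, 0.95], z)
```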

Fig. 2 MSD convergence behavior of all algorithms in the identification of a linear system under AR input signals. a \(AR1(1, - 0.99)\). b \(AR2(1,1.9,0.95)\)

5.2 Linear Phase System Identification

In this subsection, we consider a linear phase system identification problem. To identify the unknown linear phase system, the filter is constrained to preserve the linear phase at each iteration. To this end, the constraint matrix is chosen as \({\varvec{H}} = [{\varvec{I}}_{(L - 1)/2} ,{\mathbf{0}}, - {\varvec{V}}_{(L - 1)/2} ]^{T}\), where \({\varvec{V}}_{(L - 1)/2}\) denotes the reversal of the identity matrix (the exchange matrix). The response vector of length \(M = (L - 1)/2\) is \({\varvec{m}} = {\mathbf{0}}\). The system parameter vector \({\varvec{w}}_{0}\) of length L has linear phase and unit energy and is generated by the Matlab command \({\varvec{w}}_{0} = fir1(L - 1,\omega_{\alpha } )/norm\left[ {fir1(L - 1,\omega_{\alpha } )} \right]\), where \(\omega_{\alpha } = 0.5\) is the cutoff frequency and \(L = 9\). The two AR signals from Sect. 5.1 are employed in the simulation. The simulation results in Fig. 3 are similar to those in Fig. 2: the convergence performance of the CLMS algorithm is severely deteriorated, the CAPA converges faster than the CAPL, and the CNSAF algorithm converges fastest. In the AR(1) experiment of Fig. 3, CNSAF is faster than the CAPA by about 500 iterations, and in the AR(2) experiment by about 100 iterations. The simulation results in both Figs. 2 and 3 confirm the superiority of the CNSAF algorithm.
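The linear-phase constraint pair \(({\varvec{H}},{\varvec{m}})\) and the target system can be assembled as in the sketch below, where `scipy.signal.firwin` stands in for Matlab's `fir1` (we assume the two window designs are interchangeable for this illustration):

```python
import numpy as np
from scipy.signal import firwin

L = 9
P = (L - 1) // 2
V = np.fliplr(np.eye(P))              # reversal of the identity matrix
H = np.hstack([np.eye(P), np.zeros((P, 1)), -V]).T   # L x P constraint matrix
m = np.zeros(P)                       # response vector

w0 = firwin(L, 0.5)                   # linear-phase lowpass, cutoff 0.5
w0 /= np.linalg.norm(w0)              # unit energy
assert np.allclose(H.T @ w0, m)       # symmetric taps satisfy H^T w0 = 0
```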

Fig. 3 MSD convergence behavior of all algorithms in the identification of a linear phase system under AR input signals. a \(AR1(1, - 0.99)\). b \(AR2(1,1.9,0.95)\)

5.3 Sparse System Identification

In this simulation, the performance of the proposed S-CNSAF algorithm is evaluated on sparse system identification. Consider a sparse system with sparsity 15/512, as shown in Fig. 4. The constraint matrix is of size \(512 \times 40\), with elements drawn from a zero-mean, unit-variance white Gaussian distribution, and the response vector is \({\varvec{m}} = {\varvec{H}}^{T} {\mathbf{w}}_{0}\). The input signals are the same as in Sect. 5.1. From Fig. 5a, it can be seen that the S-CNSAF algorithm has the fastest convergence speed, saving about 2000 iterations compared to the CAPA for an AR(1) input at the same steady-state estimation accuracy (about − 34.5 dB). With the AR(2) input (Fig. 5b), the S-CNSAF and CAPA algorithms have almost the same convergence speed, reaching a steady-state estimation accuracy of about − 30 dB at around the 10,000th iteration. In conclusion, the CAPA performs better than the plain CNSAF in sparse system identification, while the proposed S-CNSAF algorithm achieves the best convergence performance.

Fig. 4 Impulse response of the sparse system

Fig. 5 MSD convergence behavior of all algorithms in the identification of a sparse system under AR input signals. a \(AR1(1, - 0.99)\). b \(AR2(1,1.9,0.95)\)

5.4 Validation of the Theoretical Analysis Results

In this subsection, the accuracy of the theoretical transient and steady-state MSD analysis results (41) and (42) is verified. The system parameter vector \({\varvec{w}}_{0}\) is randomly generated from a zero-mean WGN with variance 0.1. The elements of the \(L \times M\) constraint matrix \({\varvec{H}}\) follow the standard normal distribution with M = 3. The response vector is \({\varvec{m}} = {\varvec{H}}^{T} {\mathbf{w}}_{0}\).

1) Transient MSD performance

In this simulation, \(AR1(1, - 0.8)\) is used as the input signal and the filter order is L = 16. The effect of the step size and the number of subbands on the transient MSD convergence behavior of the CNSAF algorithm is investigated, and the results are presented in Fig. 6. It can be seen that the larger the step size, the faster the CNSAF algorithm converges and the higher the steady-state MSD, and vice versa. When the number of subbands is less than 4, it mainly affects the convergence speed of the algorithm and has little effect on the steady-state accuracy; when it is greater than 4, it affects only the steady-state estimation accuracy. In addition, the theoretical and simulated MSD curves are in good agreement, which demonstrates the accuracy of the theoretical transient analysis result.

Fig. 6 Theoretical and simulated transient MSD curves of the CNSAF algorithm. a Different step sizes (\(N = 8\)). b Different numbers of subbands (\(\mu = 0.05\))

2) Steady-state MSD

In this subsection, the effect of the step size and the number of subbands on the steady-state MSD of the CNSAF algorithm is investigated. The filter order is L = 48 and \(AR1(1, - 0.9)\) is used as the input signal. As shown in Fig. 7, the theoretical and simulated steady-state MSD values coincide closely, verifying the accuracy of the theoretical steady-state MSD of the CNSAF algorithm.

Fig. 7 Effect of the step size and the number of subbands on the steady-state MSD of the CNSAF algorithm

In summary, the theoretical analysis in Sect. 3 provides valid and accurate theoretical predictive models for the transient and steady-state MSD statistical behavior of the CNSAF algorithm.

6 Conclusion

In this paper, a novel constrained adaptive filtering algorithm named CNSAF was proposed, which effectively overcomes the slow convergence of the CLMS algorithm for colored inputs while maintaining a computational complexity close to that of the CLMS algorithm. We then analyzed the mean and mean-square stability and the theoretical transient and steady-state MSD behaviors of the CNSAF algorithm and derived the corresponding theoretical MSD prediction models. To effectively identify sparse systems, a sparse version of the CNSAF algorithm (S-CNSAF) was further derived. Computer simulations of system identification show that the proposed CNSAF and S-CNSAF algorithms outperform competing algorithms under colored input signals and that the obtained theoretical MSD prediction models accurately predict the MSD statistical behavior of the CNSAF algorithm.