1 Introduction

In recent years, wireless communication has become the fastest growing and most widely used technology in the field of information communication. In wireless channels, radio wave propagation produces a direct wave and a ground-reflected wave, the most common waves in the propagation process, together with scattered waves caused by various obstacles in the propagation paths [1, 2]. These phenomena result in the multi-path effect [2–5], which occurs frequently and is often severe; it may cause signal delays and reduce communication quality. Furthermore, measurements show that broadband channels behave as sparse multi-path channels, which are very common in wireless communication [6, 7]. The general characteristic of these sparse multi-path channels is that most of the channel impulse response (CIR) coefficients are equal to zero; the few nonzero coefficients are termed active coefficients [2, 7, 8]. Adaptive filters (AF) have been widely applied to the estimation of sparse channels to improve the quality of wireless communications [9–12]. The most traditional AF algorithm is the least mean square (LMS) algorithm, which is derived by minimizing the mean square estimation error [9, 13]. It is widely employed in real-world applications due to its simplicity. However, LMS performance degrades in low signal-to-noise ratio (SNR) situations and in sparse channel estimation. Motivated by compressive sensing principles [14], a series of sparse LMS algorithms has recently been proposed by incorporating l0-norm, l1-norm, or lp-norm penalties into the LMS cost function [14–21]. These algorithms have been shown to outperform LMS both in steady-state error (SSE) and in convergence speed for sparse multi-path channel estimation. Nevertheless, their performance tends to degrade in non-Gaussian white noise environments (NGWNE), a situation frequently encountered in real wireless communication systems [22–24].

Some recent efforts have also been directed at improving the performance of adaptive filters in the presence of non-Gaussian and impulsive disturbances. One such effort was to replace the MSE with information-theoretic, entropy-related cost functions [25–28]. At present, most entropy-based AF algorithms have been derived using the maximum correntropy criterion (MCC) and the minimum error entropy (MEE) criterion [25–27, 29–31]. Though MEE-based algorithms may lead to good estimation performance, they require a prohibitive computational effort for real-time applications. MCC-based algorithms are less complex and have been successfully applied to channel estimation [26]. The MCC algorithm employs the correntropy as a local similarity measure, which tends to increase the robustness of the algorithm to outliers, and has been applied to channel estimation under NGWNE [26, 32, 33]. More recently, a generalized correntropy measure has been proposed that replaces the Gaussian kernel of the MCC with a generalized Gaussian kernel; the resulting generalized MCC (GMCC) algorithm has been shown to outperform MCC in certain situations at the cost of an increase in computational complexity [34]. A normalized version of the MCC algorithm (NMCC) has also been proposed [35, 36] to make the convergence control less dependent on the input power.

To address the issue of sparse channel estimation in non-Gaussian noise, sparse MCC algorithms have been proposed by incorporating different norm penalties into the MCC cost function. Similarly to the zero-attracting LMS (ZA-LMS) algorithm [16], an l1-norm penalty was introduced into the MCC cost function as a zero attractor in [33]; the resulting algorithm was called the zero-attracting MCC (ZA-MCC) algorithm. A reweighting control factor ξ was then incorporated into the ZA-MCC algorithm, yielding the reweighting ZA-MCC (RZA-MCC) algorithm [33]. The ZA-MCC and RZA-MCC algorithms were shown to improve the steady-state and convergence performance of MCC in estimating multi-path channels. Other zero attractors were added to MCC in [33, 37, 38] using an l0-norm and a correntropy-induced metric (CIM).

One drawback of the ZA-MCC algorithms is that they uniformly attract all the channel coefficients to zero, which may result in a rather large estimation error for less sparse channels. Also, the selection of a good learning rate is not simple for these algorithms. Zero-attracting NMCC (ZA-NMCC) and reweighting ZA-NMCC (RZA-NMCC) algorithms have also been developed in [39]. Moreover, a soft parameter function has been introduced into the NMCC algorithm to attenuate the uniform attraction of coefficients to zero; the resulting algorithm is called SPF-NMCC [40]. The SPF-NMCC algorithm improved the steady-state performance of NMCC but tends to converge slowly due to the large discrepancy between coefficient values in sparse channels. To remedy this drawback, proportionate-type [41, 42] NMCC algorithms were proposed. The proportionate NMCC (PNMCC) algorithm was developed in [43, 44] and led to faster convergence than the MCC and NMCC algorithms, since the filter coefficients are updated proportionately to the magnitudes of the channel coefficient estimates. Unfortunately, however, the convergence of the PNMCC algorithm tends to slow down after an initial convergence period [43, 44], a known effect of the proportionate type of adaptation.

In this paper, we propose a new adaptive algorithm that effectively combines some of the aforementioned strategies, merging the properties of the l0-norm, the l1-norm, and proportionate adaptation. This is done by integrating an adaptive combination function (ACF) into the PNMCC cost function to create a new adaptive zero attractor. The resulting algorithm is called the Adaptive Combination Constrained Proportionate Normalized Maximum Correntropy Criterion (ACC-PNMCC) algorithm. The performance of the proposed ACC-PNMCC algorithm is investigated through several simulation experiments and compared to the MCC, NMCC, PNMCC, and sparse PNMCC algorithms in terms of SSE and convergence speed. The simulation results illustrate the potential of the ACC-PNMCC algorithm to provide superior estimation performance for sparse multi-path channel estimation under non-Gaussian and impulsive noise environments.

The paper is organized as follows. Section 2 briefly reviews the NMCC and PNMCC algorithms and derives the proposed ACC-PNMCC algorithm in the context of sparse multi-path channel estimation. Simulation experiments are presented in Section 3 to illustrate the algorithm performance, and Section 4 summarizes our work.

2 Channel estimation algorithms

2.1 The NMCC and PNMCC algorithms

Consider the estimation of an unknown channel with impulse response \(\mathbf{g}_0 = \left[g_1, g_2, g_3, \cdots, g_M\right]^T\). The M-dimensional input vector is \(\mathbf{x}(n) = \left[x(n), x(n-1), x(n-2), \cdots, x(n-M+1)\right]^T\), and the observed channel output is \(d(n) = \mathbf{x}^T(n)\,\mathbf{g}_0 + r(n)\), where \(r(n)\) is the additive noise. Denoting by \(\hat{\mathbf{g}}(n)\) the channel response estimate, the estimation error is given by \(e(n) = d(n) - \mathbf{x}^T(n)\,\hat{\mathbf{g}}(n)\).
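
As an illustration, the following minimal Python sketch generates one step of this signal model; the channel length, the single active tap, and the noise level are illustrative choices of ours, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

M = 16                       # channel length (illustrative)
g0 = np.zeros(M)             # sparse impulse response with one active tap
g0[3] = 0.8                  # tap position and value chosen for illustration

# one time step of the model: d(n) = x(n)^T g0 + r(n)
x = rng.standard_normal(M)   # x(n) = [x(n), x(n-1), ..., x(n-M+1)]^T
r = 0.01 * rng.standard_normal()
d = x @ g0 + r

# estimation error for a current estimate g_hat(n)
g_hat = np.zeros(M)
e = d - x @ g_hat
```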

It has been shown in [36] that the NMCC weight update equation is the solution of the following problem:

$$ \begin{array}{l} \min \frac{1}{2}{\left\| {\hat{\mathbf{g}}\left({n + 1} \right) - \hat{\mathbf{g}}\left({n}\right)} \right\|^{2}}\\ {\mathrm{subject \ to}}\ \hat e\left({n}\right) = \left[ 1 - \alpha_{\text{NMCC}} \exp \left({ - \frac{{{e^{2}}\left({n}\right)}}{{2{\sigma^{2}}}}}\right)\right]e\left({n}\right) \end{array} $$
(1)

where \(\|\cdot\|\) denotes the Euclidean norm of a vector, \(\hat e(n) = d(n) - \mathbf{x}^T(n)\,\hat{\mathbf{g}}(n+1)\), the parameter \(\sigma > 0\) is the bandwidth of the Gaussian kernel used to evaluate the instantaneous correntropy between the observed signal and the estimator output [26], and \(\alpha_{\text{NMCC}} > 0\) is a design parameter that affects the quality of estimation.

Solving (1) using the Lagrange multiplier technique [36] leads to the NMCC weight update equation

$$ \hat{\mathbf{g}}\left({n + 1} \right) = \hat{\mathbf{g}}\left(n \right) + {\alpha_{{\text{NMCC}}}}\frac{{\exp \left({ - \frac{{{e^{2}}\left(n \right)}}{{2{\sigma^{2}}}}} \right)}}{{{{\left\| {\textbf{x}\left(n \right)} \right\|}^{2}}}}e\left(n \right)\textbf{x}\left(n \right), $$
(2)

where \(\alpha_{\text{NMCC}}\) is the NMCC step size. If \(\alpha_{\text{NMCC}} = \beta\,\|\mathbf{x}(n)\|^{2}\), (2) reduces to the MCC algorithm update with step size β [26].
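
A minimal Python sketch of the NMCC update (2) is given below; the eps guard against a zero-input denominator is our implementation detail, and the default parameter values are illustrative.

```python
import numpy as np

def nmcc_update(g_hat, x, d, alpha=0.4, sigma=1.0, eps=1e-8):
    """One NMCC iteration following Eq. (2).

    g_hat : current channel estimate, shape (M,)
    x     : input vector x(n), shape (M,)
    d     : observed channel output d(n)
    eps is a small constant added to avoid division by zero;
    it is an implementation detail, not part of Eq. (2).
    """
    e = d - x @ g_hat                            # estimation error e(n)
    kernel = np.exp(-e**2 / (2.0 * sigma**2))    # Gaussian kernel weight
    return g_hat + alpha * kernel * e * x / (x @ x + eps)
```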

One way to modify the NMCC algorithm to improve adaptation performance for sparse channels is to adjust each adaptive weight using a different step size, which can be done through the diagonal step size matrix

$$ \mathbf{G}\left({n}\right) = {\text{diag}}\left({{m_{1}}\left({n}\right),{m_{2}}\left({n}\right),{m_{3}}\left({n}\right),...,{m_{M}}\left({n}\right)} \right). $$
(3)

The proportionate NMCC (PNMCC) algorithm assigns the step sizes mi such that

$$ {m_{i}}\left({n}\right) = \frac{{{\gamma_{i}}\left({n}\right)}}{{\sum\limits_{i = 1}^{M} {{\gamma_{i}}\left({n}\right)} }},1 \le i \le M $$
(4)

where γi(n), i=1,…,M are given by

$$ {\gamma_{i}}\left({n}\right) = \max \left[{\kappa \max \left[{\eta,\left| {{\hat{g}_{1}}\left({n}\right)} \right|,...,\left| {{\hat{g}_{M}}\left(n \right)} \right|}\right],\left| {{\hat{g}_{i}}\left({n}\right)} \right|} \right] $$
(5)

with κ and η positive constants with typical values κ=5/M and η=0.01 [42–45]. Parameter κ prevents the small coefficients from stalling when they are much smaller than the largest one, while η avoids the stalling of all coefficients when \(\hat{\mathbf{g}}(n) = \mathbf{0}_{M \times 1}\) at initialization. The weight update equation for the PNMCC algorithm is

$$ \hat{\mathbf{g}}(n+1) = \hat{\mathbf{g}}(n) + \alpha_{\text{PNMCC}}\,\frac{\exp\left(-\frac{e^{2}(n)}{2\sigma^{2}}\right) e(n)\,\mathbf{G}(n)\mathbf{x}(n)}{\mathbf{x}^{T}(n)\mathbf{G}(n)\mathbf{x}(n) + \theta}, $$
(6)

where θ is a regularization parameter and αPNMCC is the PNMCC step size. Although the PNMCC algorithm improves on the NMCC, its convergence rate tends to slow down after an initial period of fast convergence, a well-known issue affecting the conventional (MSE-based) PNLMS algorithm [21]. This happens mainly due to the dominant effect of the inactive taps, whose step sizes are excessively reduced as their weight estimates approach zero, reducing the convergence speed.
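
As an illustration, the proportionate gain assignment (3)–(5) and the update (6) can be sketched in Python as follows; the typical values of κ and η follow the text, while sigma and the step size default are illustrative.

```python
import numpy as np

def proportionate_gains(g_hat, kappa=None, eta=0.01):
    """Diagonal of G(n) from Eqs. (3)-(5). Each gamma_i is floored by
    kappa * max(eta, |g_1|, ..., |g_M|) so that small (inactive) taps
    do not stall; the gains m_i are the normalized gammas."""
    M = g_hat.size
    if kappa is None:
        kappa = 5.0 / M                         # typical value from the text
    floor = kappa * max(eta, np.abs(g_hat).max())
    gamma = np.maximum(floor, np.abs(g_hat))    # Eq. (5)
    return gamma / gamma.sum()                  # Eq. (4)

def pnmcc_update(g_hat, x, d, alpha=0.3, sigma=1.0, theta=1e-8):
    """One PNMCC iteration following Eq. (6); m * x implements G(n) x(n)."""
    m = proportionate_gains(g_hat)
    e = d - x @ g_hat
    kernel = np.exp(-e**2 / (2.0 * sigma**2))
    return g_hat + alpha * kernel * e * (m * x) / (x @ (m * x) + theta)
```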

Among the several strategies proposed to increase the PNLMS convergence rate after the initial convergence period, the l1-norm penalty recently proposed in [21] has led to very interesting results. In [21], the l1-norm of the weight vector pre-multiplied by the inverse of the step size matrix is used, which makes the l1 penalty governed primarily by the inactive taps. The constrained optimization problem solved in [21] to yield the ZA-PNLMS algorithm is

$$ \begin{array}{l} \hat{\mathbf{g}}(n+1) = \underset{\hat{\mathbf{g}}(n+1)}{\arg\min}\ \left\| \hat{\mathbf{g}}(n+1) - \hat{\mathbf{g}}(n) \right\|_{\mathbf{G}^{-1}}^{2} + \gamma \left\| \mathbf{G}^{-1} \hat{\mathbf{g}}(n+1) \right\|_{1} \\ \mathrm{subject\ to}\ d(n) - \mathbf{x}^{T}(n)\,\hat{\mathbf{g}}(n+1) = 0. \end{array} $$

One problem with enforcing sparsity through an l1-norm constraint is that the resulting algorithm may lose performance when identifying responses with different levels of sparsity. One approach recently proposed in [46] to mitigate this problem for the LMS algorithm is to employ a so-called non-uniform norm constraint, inspired by the p-norm frequently used in compressive sensing. The sparsity of the weight vector is enforced through an adaptive combination function (ACF), a mixture of the l0-norm and the l1-norm, whose general form is

$$ \left\| {\hat{\mathbf{g}}(n)} \right\|_{q}^{q} = \sum\limits_{i = 1}^{M} {{{\left| {{{\hat{g}}_{i}}(n)} \right|}^{q}}} $$
(7)

where 0≤q≤1. Note that the ACF differs from the classic Euclidean norm [47]. When q→0, it becomes

$$ \underset{q \to 0}{\lim} \left\| {\hat{\mathbf{g}}(n)} \right\|_{q}^{q} = {\left\| {\hat{\mathbf{g}}(n)} \right\|_{0}} $$
(8)

which approximates the l0-norm and thus counts the number of nonzero coefficients in the sparse multi-path channel. When q→1, the ACF reduces to the l1-norm:

$$ \underset{q \to 1}{\lim} \left\| {\hat{\mathbf{g}}(n)} \right\|_{q}^{q} = {\left\| {\hat{\mathbf{g}}(n)} \right\|_{1}} = \sum\limits_{i = 1}^{M} {\left| {{{\hat{g}}_{i}}(n)} \right|}. $$
(9)
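
To make the behavior of (7)–(9) concrete, a small numerical sketch with an illustrative four-tap vector is given below; note the explicit exclusion of exact zeros, which reproduces the l0-norm limit despite Python evaluating 0**0 as 1.

```python
import numpy as np

def acf(g_hat, q):
    """The ACF of Eq. (7): sum_i |g_i|^q, for 0 <= q <= 1.
    Exact zeros are excluded so that the q -> 0 case reproduces
    the l0-norm of Eq. (8) (0**0 == 1 in Python would otherwise
    count the zero taps as well)."""
    g_abs = np.abs(g_hat[g_hat != 0])
    return np.sum(g_abs ** q)

g = np.array([0.0, 0.5, 0.0, -1.2])
print(acf(g, 0))   # 2.0 -> number of nonzero taps, Eq. (8)
print(acf(g, 1))   # 1.7 -> l1-norm, Eq. (9)
```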

2.2 The proposed ACC-PNMCC algorithm

To address the problems of estimating a sparse response with possibly different levels of sparsity, obtaining a good convergence rate from initialization to steady-state, and providing robustness to impulsive noise, we propose to integrate an adaptive combination function (ACF) into the PNMCC’s cost function. This approach leads to the design of a new correntropy-based zero attractor algorithm that combines the techniques of [21, 4650]. The proposed ACC-PNMCC algorithm is derived in the following.

Integrating the ACF into the cost function of the PNMCC algorithm with the weight scaling as in [21], we propose the following optimization problem:

$$ \begin{aligned} &\underset{\hat{\mathbf{g}}(n+1)}{\min}\ \frac{1}{2}\left\| \hat{\mathbf{g}}(n+1) - \hat{\mathbf{g}}(n) \right\|_{\mathbf{G}^{-1}(n)}^{2} + \gamma_{\text{ACC}}\left\| \mathbf{G}^{-1}(n)\,\hat{\mathbf{g}}(n+1) \right\|_{q}^{q}\\ &\text{subject to}\ \hat e(n) = \left[1 - \alpha \exp\left(-\frac{e^{2}(n)}{2\sigma^{2}}\right)\right] e(n) \end{aligned} $$
(10)

where γACC is a trade-off parameter between the steady-state performance and the ACF penalty. Using Lagrange multipliers, the cost function associated with this optimization problem is

$$ \begin{aligned} J_{\text{ACC}}(n+1) =\ &\frac{1}{2}\left(\hat{\mathbf{g}}(n+1) - \hat{\mathbf{g}}(n)\right)^{T}\mathbf{G}^{-1}(n)\left(\hat{\mathbf{g}}(n+1) - \hat{\mathbf{g}}(n)\right)\\ &+ \gamma_{\text{ACC}}\left\| \mathbf{G}^{-1}(n)\,\hat{\mathbf{g}}(n+1) \right\|_{q}^{q}\\ &+ \lambda_{\text{ACC}}\left[ \hat e(n) - \left[1 - \alpha \exp\left(-\frac{e^{2}(n)}{2\sigma^{2}}\right)\right] e(n) \right] \end{aligned} $$
(11)

where λACC is the Lagrange multiplier.

As a first step to minimize (11) with respect to \({\hat {\mathbf {g}}\left ({n + 1} \right)}\), we impose

$$ \frac{\partial J_{\text{ACC}}(n+1)}{\partial \hat{\mathbf{g}}(n+1)} = \mathbf{0} $$
(12)

which yields the recursion

$$ \begin{aligned} \hat{\mathbf{g}}\left({n + 1} \right) &= \hat{\mathbf{g}}(n) + {\lambda_{\text{ACC}}}\mathbf{G}(n)\mathbf{x}(n) \\&\quad- {\gamma_{{\text{ACC}}}}\mathbf{G}(n)\frac{{\partial \left\| {{\mathbf{G}^{- 1}}(n)\hat{\mathbf{g}}\left({n + 1} \right)} \right\|_{q}^{q}}}{{\partial \hat{\mathbf{g}}\left({n + 1} \right)}}. \end{aligned} $$
(13)

Now, imposing the equality constraint

$$ \hat e(n) = \left[1 - \alpha \exp\left(-\frac{e^{2}(n)}{2\sigma^{2}}\right)\right] e(n), $$
(14)

and solving for λACC yields

$$ {{} \begin{aligned} {\lambda_{\text{ACC}}} = \frac{{\alpha \exp \left({ - \frac{{{e^{2}}(n)}}{{2{\sigma^{2}}}}} \right)e(n) + {\gamma_{{\text{ACC}}}}{\mathbf{x}^{{T}}}(n)\mathbf{G}(n)\frac{{\partial \left\| {{\mathbf{G}^{- 1}}(n)\hat{\mathbf{g}}\left({n + 1} \right)} \right\|_{q}^{q}}}{{\partial \hat{\mathbf{g}}\left({n + 1} \right)}}}}{{{\mathbf{x}^{T}}(n)\mathbf{G}(n)\mathbf{x}(n)}}. \end{aligned}} $$
(15)

Substituting (15) in (13) and rearranging the terms yields

$$ {{} \begin{aligned} \hat{\mathbf{g}}\left({n + 1} \right)=&\hat{\mathbf{g}}(n) + {\alpha_{{\text{ACC}}}}\frac{{\exp \left({ - \frac{{{e^{2}}(n)}}{{2{\sigma^{2}}}}} \right)e(n)\mathbf{G}(n)\mathbf{x}(n)}}{{{\mathbf{x}^{T}}(n)\mathbf{G}(n)\mathbf{x}(n)}} \\ &{ - {\gamma_{{\text{ACC}}}}\left\{ {\mathbf{I} - \frac{{\mathbf{x}(n){\mathbf{x}^{T}}(n)\mathbf{G}(n)}}{{\mathbf{x}^{T}}(n)\mathbf{G}(n)\mathbf{x}(n)}} \right\}\mathbf{G}(n)\frac{{\partial \left\| {{\mathbf{G}^{- 1}}(n)\hat{\mathbf{g}}\left({n + 1} \right)} \right\|_{q}^{q}}}{\partial \hat{\mathbf{g}}\left({n + 1} \right)}}. \end{aligned}} $$
(16)

In (16), we observe that the elements of the matrix \({ - \frac {{\textbf {x}(n){\textbf {x}^{T}}(n)\textbf {G}(n)}}{{{\textbf {x}^{T}}(n)\textbf {G}(n)\textbf {x}(n)}}}\) tend to be very small for reasonably large values of M. Hence, we simplify the algorithm recursion using the approximation \(\textbf {I}{ - \frac {{\textbf {x}(n){\textbf {x}^{T}}(n)\textbf {G}(n)}}{{{\textbf {x}^{T}}(n)\textbf {G}(n)\textbf {x}(n)}}}\approx \textbf {I}\). The same approximation has been successfully used in [21]. Then, (16) becomes

$$\begin{array}{*{20}l} \hat{\textbf{g}}\left({n + 1} \right)=&\hat{\textbf{g}}(n) + {\alpha_{{\text{ACC}}}}\frac{{\exp \left({ - \frac{{{e^{2}}(n)}}{{2{\sigma^{2}}}}} \right)e(n)\textbf{G}(n)\textbf{x}(n)}}{{{\textbf{x}^{T}}(n)\textbf{G}(n)\textbf{x}(n)}} \\ & - {\gamma_{{\text{ACC}}}}\,\textbf{G}(n)\frac{{\partial \left\| {{\textbf{G}^{- 1}}(n)\hat{\textbf{g}}\left({n + 1} \right)} \right\|_{q}^{q}}}{{\partial \hat{\textbf{g}}\left({n + 1} \right)}} \end{array} $$
(17)

Evaluation of the gradient vector in (17) yields

$$ {{} \begin{aligned} \frac{{\partial \left\| {{\mathbf{G}^{- 1}}(n)\hat{\mathbf{g}}\left({n + 1} \right)} \right\|_{q}^{q}}}{{\partial \hat{\mathbf{g}}\left({n + 1} \right)}} = q\, \mathbf{G}^{-q}(n) \, {\boldsymbol{\psi}}(n+1) \odot \text{sgn}\left(\hat{\mathbf{g}}(n+1)\right) \end{aligned}} $$
(18)

where ⊙ denotes the Hadamard product, and

$$ {\boldsymbol{\psi}}(n+1) = \left[ |\hat{g}_{1}(n+1)|^{q-1}, \dots, |\hat{g}_{M}(n+1)|^{q-1} \right]^{T}. $$

As (18) is a function of \(\hat{\mathbf{g}}(n+1)\), its substitution in (17) would not yield a recursive update for \(\hat{\mathbf{g}}(n)\). To obtain an implementable recursive update, we replace \(\hat{\mathbf{g}}(n+1)\) in (18) with \(\hat{\mathbf{g}}(n)\), an approximation that tends to be reasonable for practical step sizes. The resulting weight update equation is then

$$\begin{array}{*{20}l} \hat{\mathbf{g}}\left({n + 1} \right)=&\hat{\mathbf{g}}(n) + {\alpha_{{\text{ACC}}}}\,\frac{ {\exp \left({ - \frac{{{e^{2}}(n)}}{{2{\sigma^{2}}}}} \right)e(n)}}{ {{\mathbf{x}^{T}}(n)\,\mathbf{G}(n)\mathbf{x}(n)}}\mathbf{G}(n)\mathbf{x}(n) \\ &- \gamma_{\text{ACC}} \, q\, \!\left(\mathbf{G}^{-1}(n)\right)^{q-1} \,{\boldsymbol{\psi}}(n) \odot \text{sgn}\!\left(\hat{\mathbf{g}}(n)\right). \end{array} $$
(19)

Though recursion (19) should fulfill the objective of minimizing (11) at convergence, it is still too complex for real-time implementation due to the last term on the r.h.s., which is responsible for the zero-attracting property of the algorithm. Since we propose to use either an l0-norm or an l1-norm, q must be equal to zero or one, respectively. For q=0, the last term in (19) vanishes. For q=1, it equals \(-\gamma_{\text{ACC}}\, {\boldsymbol{\psi}}(n) \odot \text{sgn}\left(\hat{\mathbf{g}}(n)\right)\), a vector whose ith element is \(-\gamma_{\text{ACC}}\,\text{sgn}\left(\hat{g}_{i}(n)\right)\).

A final decision to be made is how to choose the value of q. According to definitions (8) and (9), q affects the norm of the whole vector \(\hat{\mathbf{g}}(n)\). However, there is no clear criterion for setting a single value of q that yields good performance in practical applications. We propose instead to determine a different value of q for each element of \(\hat{\mathbf{g}}(n)\), based on a threshold that depends on the recent behavior of each weight estimate:

$$ {h_{i}}(n) = E\left[ {\left| {{{\hat{g}}_{i}}(n)} \right|} \right],\forall 1 \le i \le M. $$
(20)

Similar solutions have been employed in [47, 49, 50] to regularize different model misfit cost functions.
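
In practice, the expectation in (20) must be estimated online. One simple possibility, shown here only as an assumed illustration (the exponential forgetting factor beta is our choice and is not prescribed by the references), is an exponentially weighted running mean of \(|\hat{g}_i(n)|\):

```python
import numpy as np

def update_threshold(h, g_hat, beta=0.99):
    """Running estimate of h_i(n) = E[|g_i(n)|] in Eq. (20) via an
    exponentially weighted moving average; beta is a forgetting
    factor (our assumption, not prescribed by the text)."""
    return beta * h + (1.0 - beta) * np.abs(g_hat)
```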

Given (20), and denoting by qi the value of q applied to \(\hat{g}_i(n)\), we set qi=0 if \(|\hat{g}_{i}(n)| > h_{i}(n)\) and qi=1 if \(|\hat{g}_{i}(n)| < h_{i}(n)\). Hence, the coefficients for which \(|\hat{g}_{i}(n)| > h_{i}(n)\) contribute to this term as if the l0-norm had been used, while those for which \(|\hat{g}_{i}(n)| < h_{i}(n)\) contribute as if the l1-norm had been used.

We implement this strategy by defining a diagonal matrix

$$ \mathbf{F} = {\text{diag}}\left({{f_{1}},{f_{2}},{f_{3}},...,{f_{M}}} \right), $$
(21)

with elements

$$ {f_{i}} = \frac{{{\text{sgn}} \left[ {{h_{i}}(n) - \left| {{{\hat{g}}_{i}}(n)} \right|} \right] + 1}}{2},\qquad \forall 1 \le i \le M $$
(22)

so that the last r.h.s. term of (19) becomes \( - {\gamma _{{\textrm {ACC}}}}\,\textbf {F}\,{\text {sgn}} \left ({\hat {\mathbf {g}}(n)} \right)\).
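
The selection matrix F of (21)–(22) can be computed elementwise; a minimal sketch, assuming the threshold estimate h from the sketch above, is:

```python
import numpy as np

def selector_matrix(g_hat, h):
    """Diagonal of F from Eqs. (21)-(22): f_i = 1 when |g_i(n)| < h_i(n)
    (the tap receives the l1 attraction), and f_i = 0 when
    |g_i(n)| > h_i(n) (l0-like, no attraction)."""
    return (np.sign(h - np.abs(g_hat)) + 1.0) / 2.0
```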

Finally, it has been verified in [50] that the zero-attracting ability of \(-\gamma_{\text{ACC}}\,\mathbf{F}\,\text{sgn}\left(\hat{\mathbf{g}}(n)\right)\) can be improved by the reweighting factor used in the RZA-MCC algorithm. With this modification, the updating equation of the proposed ACC-PNMCC algorithm becomes

$$ \begin{aligned} \hat{\mathbf{g}}(n+1) =\ &\hat{\mathbf{g}}(n) + \alpha_{\text{ACC}}\,\frac{\exp\left(-\frac{e^{2}(n)}{2\sigma^{2}}\right) e(n)\,\mathbf{G}(n)\mathbf{x}(n)}{\mathbf{x}^{T}(n)\mathbf{G}(n)\mathbf{x}(n) + \theta}\\ &- \gamma_{\text{ACC}}\,\mathbf{F}\,\frac{\text{sgn}\left(\hat{\mathbf{g}}(n)\right)}{1 + \xi\left|\hat{\mathbf{g}}(n)\right|} \end{aligned} $$
(23)

where θ is a very small positive regularization constant, ξ is the reweighting factor, and αACC denotes the step size.

The last r.h.s. term in (23) is the newly designed zero attractor. Each coefficient is treated as if a different norm penalty had been applied to the cost function, which is achieved by comparing each channel coefficient estimate to the designed threshold. In the following, we show that the proposed zero attractor speeds up the convergence of the small coefficients after the initial convergence period, so that the proposed ACC-PNMCC algorithm outperforms the PNMCC algorithm. In addition, better sparse multi-path channel estimation performance can be obtained by properly selecting the value of the reweighting factor ξ.
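
Putting the pieces together, a sketch of one possible implementation of the complete ACC-PNMCC recursion (23) is given below. It reuses the helper functions sketched above; the kernel bandwidth sigma is an assumed value, and the remaining defaults follow the parameter values used in Section 3.

```python
import numpy as np

def acc_pnmcc(x_sig, d_sig, M, alpha=0.27, gamma=5e-4, xi=5.0,
              sigma=1.0, theta=1e-8):
    """Sketch of the ACC-PNMCC recursion of Eq. (23). Reuses
    proportionate_gains(), update_threshold(), and selector_matrix()
    from the previous sketches; sigma and theta are assumed values."""
    g_hat = np.zeros(M)
    h = np.zeros(M)                              # threshold estimates h_i(n)
    for n in range(M - 1, len(x_sig)):
        x = x_sig[n - M + 1:n + 1][::-1]         # x(n) = [x(n),...,x(n-M+1)]^T
        e = d_sig[n] - x @ g_hat                 # estimation error e(n)
        kernel = np.exp(-e**2 / (2.0 * sigma**2))
        m = proportionate_gains(g_hat)           # diagonal of G(n)
        grad = kernel * e * (m * x) / (x @ (m * x) + theta)
        h = update_threshold(h, g_hat)           # h_i(n), Eq. (20)
        f = selector_matrix(g_hat, h)            # diagonal of F, Eq. (22)
        attractor = f * np.sign(g_hat) / (1.0 + xi * np.abs(g_hat))
        g_hat = g_hat + alpha * grad - gamma * attractor
    return g_hat
```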

3 Results and discussions

In this section, several simulation experiments are carried out to investigate the performance of the proposed ACC-PNMCC algorithm. The input signal x(n) is a random signal, while the noise r(n) is drawn from the mixed Gaussian distribution [33, 37, 40]

$$ \left({1 - \chi} \right)N\left({{\mu_{1}},\nu_{1}^{2}} \right) + \chi N\left({{\mu_{2}},\nu_{2}^{2}} \right), $$
(24)

where \(N\left(\mu_i, \nu_i^2\right)\), i=1,2, denotes a Gaussian distribution with mean \(\mu_i\) and variance \(\nu_i^2\). The mixing parameter χ balances the two Gaussian components. In our simulation experiments, these parameters are set to \(\left(\mu_1, \mu_2, \nu_1^2, \nu_2^2, \chi\right) = \left(0, 0, 0.05, 20, 0.05\right)\).
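
For reproducibility, a sketch of a sampler for the mixture (24), with the above parameter values as defaults, is given below (the function name and interface are ours):

```python
import numpy as np

def mixed_gaussian_noise(n_samples, chi=0.05, mu=(0.0, 0.0),
                         var=(0.05, 20.0), rng=None):
    """Samples from the two-component Gaussian mixture of Eq. (24):
    with probability (1 - chi) draw from N(mu1, var1), otherwise from
    the high-variance impulsive component N(mu2, var2)."""
    if rng is None:
        rng = np.random.default_rng()
    pick = rng.random(n_samples) < chi            # impulsive component mask
    noise = rng.normal(mu[0], np.sqrt(var[0]), n_samples)
    noise[pick] = rng.normal(mu[1], np.sqrt(var[1]), pick.sum())
    return noise
```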

The performance of the proposed ACC-PNMCC algorithm is evaluated through the steady-state mean square deviation (MSD), which is defined as

$$ \text{MSD}\left(\hat{\mathbf{g}}(n)\right) = E\left[\left\| \mathbf{g}_0 - \hat{\mathbf{g}}(n) \right\|^{2}\right]. $$
(25)

From the updating Eq. (23), we notice that several key parameters may affect the ACC-PNMCC performance and must therefore be properly selected. To guide this selection, we experimentally analyzed their effect on the MSD performance of the ACC-PNMCC algorithm. The key parameters are αACC, γACC, and ξ. Each parameter was tuned to obtain a small MSD by varying it while keeping the other parameters at their optimal values. In the experiments designed for this study, the number of nonzero coefficients is 1 and the total length of the unknown channel is M=16. Firstly, γACC is studied under different SNRs [51], with αACC=0.27 and ξ=5. The simulation results are shown in Fig. 1. The MSD decreases as the SNR increases from 1 to 40 dB for every value of γACC, but with different slopes, meaning that the effect of γACC depends on the SNR. As can be verified from Fig. 1, γACC=5×10−4 yields the smallest MSD for an SNR of 30 dB. Hence, we study the effect of the step size αACC on the MSD performance for SNR = 30 dB and γACC=5×10−4. The corresponding MSD is shown in Fig. 2. The MSD gradually decreases as the step size increases from 1×10−3 to 5×10−2 and increases for step sizes greater than 5×10−2, indicating the importance of a proper choice of the step size value. Finally, the effect of the reweighting factor ξ on the zero-attractor term is studied for a simple case; the results are shown in Fig. 3. We note that the reweighting factor mainly attracts to zero the coefficients that are smaller than the defined threshold, while the zero-attractor term becomes zero for coefficients greater than the threshold. In addition, the shrinkage ability of the reweighting factor decreases as ξ increases from 0 to 25. Thus, the reweighting factor in the proposed ACC-PNMCC algorithm exerts strong zero attraction on the relatively small coefficients to improve their convergence after the initial transient.

Fig. 1 The MSD performance of the proposed ACC-PNMCC algorithm with different γACC

Fig. 2 The effects of αACC on the MSD performance of the proposed ACC-PNMCC algorithm at SNR = 30 dB and γACC=5×10−4

Fig. 3 The effects of ξ on the zero-attracting ability of the proposed ACC-PNMCC algorithm

Based on the above parameter selections, the MSD performance of the proposed ACC-PNMCC algorithm was verified for different sparsity levels at SNR = 30 dB. For comparison purposes, the conventional MCC, NMCC, PNMCC, and sparse PNMCC algorithms were also considered in the simulations. The sparse PNMCC algorithms (ZA-PNMCC and RZA-PNMCC) were obtained using the l1-norm and are briefly described in the Appendix. In this experiment, K denotes the sparsity level, i.e., the number of nonzero coefficients in the sparse multi-path channel. Firstly, a single nonzero channel coefficient (K=1) is randomly located within the unknown sparse multi-path channel. To obtain the same initial convergence speed for all the compared algorithms, the parameters of the MCC, NMCC, PNMCC, ZA-PNMCC, and RZA-PNMCC algorithms were set as follows: αMCC=0.03, αNMCC=0.4, αPNMCC=αZA=αRZA=0.3, γZA=5×10−5, γRZA=1×10−4, αACC=0.27, γACC=5×10−4, and ξ=5. Here, αMCC, αZA, and αRZA are the step sizes of the MCC, ZA-PNMCC, and RZA-PNMCC algorithms, respectively, while γZA and γRZA are the trade-off parameters of the ZA-PNMCC and RZA-PNMCC algorithms, respectively. The MSD performance of the proposed ACC-PNMCC algorithm for K=1 is given in Fig. 4, where ACC-PNMCC achieves the lowest steady-state MSD among all the compared algorithms for the same initial convergence speed. The corresponding steady-state MSDs for K=2, 4, and 8 are shown in Figs. 5, 6, and 7. The steady-state MSDs of the ZA-PNMCC, RZA-PNMCC, and ACC-PNMCC algorithms increase as the sparsity level grows from K=1 to K=8; however, the MSD of the ACC-PNMCC algorithm remains lower than those of the other algorithms. In addition, the MSDs of the compared algorithms are plotted against the sparsity level K/M in Fig. 8 to show the effect of the sparsity level more intuitively. The MSDs of ZA-PNMCC, RZA-PNMCC, and ACC-PNMCC gradually increase as K/M increases from 0.0625 to 0.5, in line with the results in Figs. 4, 5, 6, and 7. The MSDs of the MCC, NMCC, and PNMCC algorithms remain almost invariant, as they do not have any zero-attractor term sensitive to the sparsity K/M. One should note, however, that the MSDs of ZA-PNMCC and ACC-PNMCC are almost equal to 1×10−4 when K/M=0; this is because the ACC-PNMCC algorithm uses the l1-norm when K/M=0, like the ZA-PNMCC algorithm. When K/M is greater than 0.0625, the ACC-PNMCC algorithm is always better than the other algorithms. Thus, the ACC-PNMCC algorithm is well suited to sparse multi-path channel estimation in practical applications.

Fig. 4 The MSD performance of the proposed ACC-PNMCC algorithm with K=1

Fig. 5 The MSD performance of the proposed ACC-PNMCC algorithm with K=2

Fig. 6 The MSD performance of the proposed ACC-PNMCC algorithm with K=4

Fig. 7 The MSD performance of the proposed ACC-PNMCC algorithm with K=8

Fig. 8 The MSD performance of the proposed ACC-PNMCC algorithm with different coefficient sparsities

The third experiment considered the application of ACC-PNMCC to echo cancelation. A typical echo path with 256 taps and 16 nonzero coefficients, also known as a block-sparse path [52], was used. The scheme of [52] relies on data-reusing, sign, and memory strategies, which entail high complexity, and does not quantify the sparsity by a formula. Herein, the sparsity of the echo channel is measured by \(\vartheta_{12} = \frac{M}{M - \sqrt{M}}\left(1 - \frac{\left\|\mathbf{g}\right\|_1}{\sqrt{M}\,\left\|\mathbf{g}\right\|_2}\right)\) [40, 45, 49]. In this experiment, we use 𝜗12=0.8222 for the first 8000 iterations and 𝜗12=0.7362 thereafter. The other simulation parameters are αMCC=0.0055, αNMCC=1.3, αPNMCC=αZA=αRZA=0.9, γZA=1×10−6, γRZA=1.5×10−6, αACC=0.8, γACC=5×10−6, and ξ=5. The corresponding steady-state MSDs are shown in Fig. 9. It can be seen that ACC-PNMCC outperforms the other algorithms in terms of both steady-state MSD and convergence speed. Even after the sparsity changes from 0.8222 to 0.7362, the performance of the proposed ACC-PNMCC algorithm remains superior to that of the other algorithms, indicating the effectiveness of the ACC-PNMCC algorithm for echo cancelation.
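
The sparsity measure 𝜗12 can be computed directly from its definition; a small sketch:

```python
import numpy as np

def sparsity_measure(g):
    """theta_12 = M/(M - sqrt(M)) * (1 - ||g||_1 / (sqrt(M) ||g||_2)),
    which is 0 for a maximally dense (uniform-magnitude) vector and
    1 for a single-tap, maximally sparse response."""
    M = g.size
    l1, l2 = np.abs(g).sum(), np.linalg.norm(g)
    return (M / (M - np.sqrt(M))) * (1.0 - l1 / (np.sqrt(M) * l2))
```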

Fig. 9 The MSD performance of the proposed ACC-PNMCC algorithm in the echo cancelation application

Finally, the computational complexity of the proposed ACC-PNMCC algorithm is presented in Table 1 together with that of the competing algorithms. The complexities are counted in terms of multiplications, additions, exponentiations, divisions, and comparisons. From Table 1, the computational complexity of the ACC-PNMCC algorithm is slightly higher than that of the ZA-PNMCC and RZA-PNMCC algorithms, owing to the calculation of the ACF. In return, the proposed ACC-PNMCC algorithm noticeably increases the convergence speed and reduces the MSD.

Table 1 Numerical complexities of the algorithms

Based on the above experimental analysis, we infer that the proposed ACC-PNMCC algorithm offers superior steady-state MSD performance and convergence speed for sparse multi-path channel estimation applications. This is because ACC-PNMCC applies the l0-norm penalty to the channel coefficients that are larger than the designed threshold, while it applies the l1-norm penalty to the channel coefficients that are smaller than the threshold, attracting the relatively small coefficients to zero to improve convergence. Thus, the proposed ACC-PNMCC algorithm performs better than the other algorithms for sparse channel estimation.

4 Conclusions

In this paper, the ACC-PNMCC algorithm has been proposed for sparse multi-path channel estimation. The proposed algorithm exploits the inherent sparsity of multi-path channels through the designed zero attractor. Its performance was investigated and compared with that of the MCC, NMCC, PNMCC, and sparse PNMCC algorithms for sparse multi-path channel estimation. Experimental results illustrated that the proposed ACC-PNMCC algorithm is superior to the competing algorithms in terms of both steady-state MSD and convergence speed.

5 Appendix

Similarly to the ZA-MCC and ZA-LMS algorithms [16, 33, 37, 38], an l1-norm penalty is introduced into the cost function of the PNMCC algorithm to develop the zero-attracting PNMCC (ZA-PNMCC) algorithm. The cost function of the ZA-PNMCC algorithm is

$$ \begin{aligned} J_{\text{ZA}}(n+1) =\ &\left(\hat{\mathbf{g}}(n+1) - \hat{\mathbf{g}}(n)\right)^{T}\mathbf{G}^{-1}(n)\left(\hat{\mathbf{g}}(n+1) - \hat{\mathbf{g}}(n)\right)\\ &+ \gamma_{\text{ZA}}\left\| \mathbf{G}^{-1}(n)\,\hat{\mathbf{g}}(n+1) \right\|_{1} + \lambda_{\text{ZA}}\left[ \hat e(n) - \left[1 - \alpha \exp\left(-\frac{e^{2}(n)}{2\sigma^{2}}\right)\right] e(n) \right], \end{aligned} $$
(26)

where λZA is the Lagrange multiplier. Using the Lagrange multiplier method, the updating equation of the ZA-PNMCC algorithm can be written as

$$ \hat{\mathbf{g}}(n+1) = \hat{\mathbf{g}}(n) + \alpha_{\text{ZA}}\,\frac{\exp\left(-\frac{e^{2}(n)}{2\sigma^{2}}\right) e(n)\,\mathbf{G}(n)\mathbf{x}(n)}{\mathbf{x}^{T}(n)\mathbf{G}(n)\mathbf{x}(n) + \theta} - \gamma_{\text{ZA}}\,\text{sgn}\left(\hat{\mathbf{g}}(n)\right), $$
(27)

where αZA and γZA are the step size and trade-off parameter of the ZA-PNMCC algorithm, respectively. Similarly to the RZA-MCC and RZA-LMS algorithms [16, 33, 37, 38], a reweighting factor is introduced into the ZA-PNMCC algorithm to develop the reweighting ZA-PNMCC (RZA-PNMCC) algorithm, whose updating equation is

$$ \begin{aligned} \hat{\mathbf{g}}\left({n + 1} \right) &= \hat{\mathbf{g}}(n) + {\alpha_{{\text{RZA}}}}\frac{{\exp \left({ - \frac{{{e^{2}}(n)}}{{2{\sigma^{2}}}}} \right)e(n)\mathbf{G}(n)\mathbf{x}(n)}}{{{\mathbf{x}^{T}}(n)\mathbf{G}(n)\mathbf{x}(n) + \theta}} \\&\quad- {\gamma_{\text{RZA}}}\frac{{{\text{sgn}} \left({\hat{\mathbf{g}}(n)} \right)}}{{1 + {\xi_{1}}\left| {\hat{\mathbf{g}}(n)} \right|}}, \end{aligned} $$
(28)

where ξ1 is the reweighting factor, αRZA denotes the step size, and γRZA represents the trade-off parameter of the RZA-PNMCC algorithm.
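
For completeness, the zero-attractor terms that distinguish (27) and (28) from the plain PNMCC update can be sketched as follows; the trade-off defaults are taken from the K=1 experiment in Section 3, while the value of ξ1 is assumed for illustration.

```python
import numpy as np

def za_attractor(g_hat, gamma_za=5e-5):
    """Zero attractor of the ZA-PNMCC update, last term of Eq. (27):
    a uniform pull of every tap toward zero."""
    return -gamma_za * np.sign(g_hat)

def rza_attractor(g_hat, gamma_rza=1e-4, xi1=5.0):
    """Reweighted attractor of the RZA-PNMCC update, Eq. (28); the
    1/(1 + xi1 |g_i|) factor weakens the pull on large (active) taps.
    The default xi1 is our assumption, not specified in the text."""
    return -gamma_rza * np.sign(g_hat) / (1.0 + xi1 * np.abs(g_hat))
```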