Advanced Topics and New Directions

  • Leonardo Rey Vega
  • Hernan Rey
Part of the SpringerBriefs in Electrical and Computer Engineering book series (BRIEFSELECTRIC)


In this final chapter we briefly discuss topics not covered in the previous chapters. These topics are more advanced or are the subject of active research in adaptive filtering. For each one we give a short introduction and several relevant references for the interested reader.



6.1 Optimum \(H^{\infty }\) Algorithms

The LMS approximately solves the optimum linear mean square problem with a simple algorithm. The RLS has, under certain conditions, better performance in terms of convergence speed and steady state error. However, both algorithms have to deal with perturbations in real-world environments. The concept of robustness is linked to the sensitivity of an algorithm to such perturbations. In particular, we focus now on the idea that an algorithm is robust if it does not amplify the energy of the perturbations [1] (this is not the same idea as the one used later in Sect. 6.4).

In the mid-1990s, Hassibi et al. presented a relationship between \(H\,^{\infty }\) optimal filters and Kalman filtering in Krein spaces [2]. They also proved that the robustness of the LMS algorithm stems from the fact that it is a minimax algorithm, i.e., an \(H\,^{\infty }\) optimal algorithm [3]. In addition, it turns out that the RLS is \(H\,^{\infty }\) suboptimal. \(H\,^{\infty }\) optimal filtering has also been tackled with different approaches [4]. However, the approach used by Hassibi et al. relies on an elegant theory that reveals the relation between \(H\,^{\infty }\) optimal filtering and Kalman filtering in Krein spaces (which are indefinite metric spaces) [2].

The \(H\,^{\infty }\) optimality can be interpreted as follows: the ratio between the energy of the estimation errors and the energy of the perturbations (and model uncertainties) is upper-bounded for every possible realization of the perturbations. Therefore, small perturbations lead to small estimation errors. With a proper definition of the errors and perturbations, the regularized APA was found to be \(H\,^{\infty }\) optimal [5]. As the \(H\,^{\infty }\) optimality might be an infinite horizon problem, local energy relations were also found for the APA [5] by following the ideas introduced in [6]. These relations guarantee a local robust behavior (robust at each time instant).
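As a numerical illustration of this bound, the sketch below runs the LMS on synthetic data and measures the ratio between the energy of the undisturbed a priori estimation errors and the energy of the disturbances, where the latter includes the initial uncertainty weighted by \(\mu ^{-1}\) as in [3]. The function name, the data model (regressor entries uniform in \([-1,1]\), occasional large noise bursts) and the parameter values are assumptions made only for this example. The step size is chosen so that \(\mu \Vert \mathbf x (n)\Vert ^2\le 1\), the condition under which the ratio should not exceed one.

```python
import random


def lms_energy_ratio(num_taps=4, num_samples=2000, mu=0.2, seed=0):
    """Run the LMS and return (estimation error energy) / (disturbance energy)."""
    rng = random.Random(seed)
    w_true = [rng.uniform(-1.0, 1.0) for _ in range(num_taps)]
    w = [0.0] * num_taps  # initial estimate set to zero

    # The disturbance energy includes the initial uncertainty, weighted by 1/mu.
    disturbance_energy = sum(wt ** 2 for wt in w_true) / mu
    error_energy = 0.0

    for n in range(num_samples):
        # Entries in [-1, 1] give ||x||^2 <= num_taps, so mu * ||x||^2 <= 0.8 here.
        x = [rng.uniform(-1.0, 1.0) for _ in range(num_taps)]

        # Background noise plus an occasional very large perturbation.
        v = rng.gauss(0.0, 0.1)
        if n % 97 == 0:
            v += 50.0

        # Undisturbed a priori estimation error and the observed error.
        e_a = sum((wt - wi) * xi for wt, wi, xi in zip(w_true, w, x))
        e = e_a + v

        error_energy += e_a ** 2
        disturbance_energy += v ** 2

        # LMS recursion.
        w = [wi + mu * xi * e for wi, xi in zip(w, x)]

    return error_energy / disturbance_energy


if __name__ == "__main__":
    print("energy ratio:", lms_energy_ratio())
```

Changing the seed or the size of the bursts changes the value of the ratio, but with this step size it should stay below one: the perturbations are never amplified.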

6.2 Variable Step Size/Regularization Adaptive Algorithms

As we have seen in Chap.  4, the choice of the step size \(\mu \) poses a tradeoff between speed of convergence and steady state error. To overcome this tradeoff, it seems natural to look for variable step size strategies. In this way, we expect to have a large step size at the beginning of the adaptation process (to improve the speed of convergence) and then decrease it as we get closer to the steady state (to allow for a smaller final misadjustment).

Instead of using \(e^2(n)\) as a measure of the proximity to the steady state (and adjusting the step size accordingly), in [7] the authors proposed to use \(e(n)e(n-1)\). If the noise is i.i.d., the proposed measure would be a better estimator of the proximity to the steady state. In addition, several VSS algorithms may not work very reliably since they depend on several parameters that are not simple to tune in practice. To overcome this issue, a VSS algorithm is proposed in [8]. Remember that the NLMS can be seen as a VSS-LMS algorithm where the step size is chosen to nullify the a posteriori error. But as we mentioned, this would cause the filter to compensate for the presence of the noise \(\textit{v}(n)\). Then, the idea in [8] is to find \(\mu (n)\) so that the power of the a posteriori error is equal to that of the noise. This idea was further extended to the APA [9].
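The idea of matching the a posteriori error power to the noise power can be sketched as follows. For an update of the form \(\mathbf w (n)=\mathbf w (n-1)+\mu (n)\mathbf x (n)e(n)\), the a posteriori error is \(e(n)\left(1-\mu (n)\Vert \mathbf x (n)\Vert ^2\right)\), so equating its power to the noise power \(\sigma _v^2\) suggests \(\mu (n)=\left(1-\sigma _v/\sigma _e(n)\right)/\Vert \mathbf x (n)\Vert ^2\). The code below is a minimal sketch in that spirit and not the exact algorithm of [8]: the exponential smoothing used to estimate \(\sigma _e(n)\), the assumption that \(\sigma _v\) is known, and all names and parameter values are choices made here for illustration.

```python
import math
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def npvss_step(sigma_e, sigma_v, x_energy, delta=1e-2, eps=1e-8):
    """Step size aiming at an a posteriori error power equal to the noise power."""
    if sigma_e <= sigma_v:
        return 0.0  # error already at the noise floor: freeze the adaptation
    return (1.0 - sigma_v / (sigma_e + eps)) / (delta + x_energy)


def run_vss_nlms(w_true, sigma_v=0.1, num_samples=3000, lam=0.95, seed=1):
    """Identify w_true from noisy data using the variable step size above."""
    rng = random.Random(seed)
    w = [0.0] * len(w_true)
    x = [0.0] * len(w_true)  # tapped delay line holding the input samples
    power_e = 0.0            # running estimate of the error power

    for _ in range(num_samples):
        x = [rng.gauss(0.0, 1.0)] + x[:-1]
        d = dot(w_true, x) + rng.gauss(0.0, sigma_v)
        e = d - dot(w, x)

        # Exponentially smoothed estimate of the error power (an assumption).
        power_e = lam * power_e + (1.0 - lam) * e * e

        mu = npvss_step(math.sqrt(power_e), sigma_v, dot(x, x))
        w = [wi + mu * xi * e for wi, xi in zip(w, x)]

    return w


if __name__ == "__main__":
    print(run_vss_nlms([1.0, -0.5, 0.25, 0.0]))
```

Early in the adaptation the estimated error power is well above the noise power and the normalized step is close to one; near the steady state the two powers are similar and the step size shrinks toward zero.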

Several of the VSS algorithms proposed in the literature suffer from a drop in performance when the SNR is low. Large noise samples lead to large estimation errors, so a small value of \(\mu \) would be required to keep the algorithm stable. However, when the noise samples have small energy, the small value of \(\mu \) would make the convergence unnecessarily slow. The family of robust VSS algorithms introduced in [10, 11, 12, 13] addresses this issue and can improve the steady state performance without compromising the speed of convergence (see Sect. 6.4 for further details).

Finally, it should be noted that instead of varying the step size \(\mu \), the regularization parameter \(\delta \) could be varied. This is particularly suitable for the APA, where the use of regularization can be essential when dealing with large filters and highly colored nonstationary signals, as in acoustic echo cancellation. The interested reader can find more information on variable step size algorithms in [14, 15, 16] and on variable regularization algorithms in [5, 17, 18, 19].

6.3 Sparse Adaptive Filters

In several system identification applications, the system to be estimated is sparse. That is, it presents a small number of large coefficients, while the rest of them are either zero or have small magnitudes. In other words, these systems have the property of concentrating most of their energy in a small fraction of their coefficients. This is typically the case in applications such as acoustic echo cancellation [20] and wireless channel identification [21, 22]. In such systems, and assuming that the initial condition is zero, it is natural to think that the larger coefficients will need more time to converge than the ones which are very small or zero. As we saw in Chap.  4, the speed of convergence is governed by the step size \(\mu \). In [23], Duttweiler proposed to use different variable step sizes for each component of the adaptive filter \(\mathbf w (n)\). Then, the step size can be written as:
$$\begin{aligned} \boldsymbol{\mu }(n)=\text{diag}\left(\mu _0(n),\mu _1(n),\dots ,\mu _{L-1}(n)\right). \end{aligned}$$
The dynamics of each step size \(\mu _i(n)\), \(i=0,\dots , L-1\), is governed by the dynamics of the adaptive filter \(\mathbf w (n)\) itself. The specific mathematical details can be found in [23], but basically the step sizes \(\mu _i(n)\) are proportional to the magnitude of \(\textit{w}_i(n)\). In this way, the largest coefficients in \(\mathbf w (n)\) will have larger step sizes, improving their convergence speed. When the true system to be identified is sparse, this choice could lead to important gains in speed of convergence without compromising the steady state behavior, or even improving it. Several variants of this idea exist [24, 25, 26]. In [27], the potential sparsity of the system to be identified is exploited as a priori information, leading to a more elegant formulation and to the derivation of other algorithms. The idea of exploiting sparsity as a priori information can be formalized using Riemannian manifolds [28, 29]. Recently, new results on sparse system identification have been obtained using ideas from compressive sampling [30]. The reader can consult [31, 32, 33, 34].
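A minimal sketch of a proportionate update of this kind is given below. It follows the general structure of the proportionate NLMS described in [23], where each coefficient receives a gain proportional to its current magnitude, a small floor keeps zero-valued coefficients adapting, and the gains are normalized to have unit mean. The function names and the values of `rho`, `delta_p`, `mu` and `delta` are illustrative assumptions, not values prescribed by the text.

```python
def proportionate_gains(w, rho=0.01, delta_p=0.01):
    """Per-coefficient gains proportional to |w_i|, normalized to unit mean."""
    # delta_p keeps the gains well defined when all coefficients are zero.
    largest = max(delta_p, max(abs(wi) for wi in w))
    # rho * largest is a floor, so small coefficients never stop adapting.
    gamma = [max(rho * largest, abs(wi)) for wi in w]
    mean_gamma = sum(gamma) / len(w)
    return [g / mean_gamma for g in gamma]


def pnlms_update(w, x, d, mu=0.5, delta=1e-3):
    """One proportionate NLMS-type update; returns the new weights and e(n)."""
    g = proportionate_gains(w)
    e = d - sum(wi * xi for wi, xi in zip(w, x))
    # Normalization by the gain-weighted input energy (regularized by delta).
    norm = delta + sum(gi * xi * xi for gi, xi in zip(g, x))
    w_new = [wi + mu * gi * xi * e / norm for wi, gi, xi in zip(w, g, x)]
    return w_new, e


if __name__ == "__main__":
    # The large coefficient gets most of the adaptation gain.
    print(proportionate_gains([0.0, 1.0, 0.0, 0.1]))
```

With all coefficients at zero the gains are all equal to one and the update reduces to a regularized NLMS step; as the large coefficients emerge, they take a growing share of the step size.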

6.4 Robust Adaptive Filters

In real-world adaptive filtering applications, severe impairments may occur. Perturbations such as background and impulsive noise can deteriorate the performance of many adaptive filters under a system identification setup. Consider for example that the input-output pairs are related by the linear regression model \(d(n)=\mathbf w _{\mathrm T}^T\mathbf x (n)+\textit{v}(n)\), where \(\textit{v}(n)\) is additive noise. Consider also, for example, the LMS recursion, which can be written as:
$$\begin{aligned} \mathbf w (n) = \mathbf w (n-1) + \mu \mathbf x (n) e(n). \end{aligned}$$
It is clear that \(e(n)=\tilde{\mathbf w }^T(n)\mathbf x (n)+\textit{v}(n)\), where \(\tilde{\mathbf w }(n)\) is the usual misalignment vector. Suppose then, that the adaptive filter has a given estimate of the true system at a certain time step. If a large noise sample perturbs it, the result will be a large change in the system estimate:
$$\begin{aligned} \Vert \mathbf w (n)-\mathbf w (n-1)\Vert \approx |\textit{v}(n)|. \end{aligned}$$
If the noise sample \(\textit{v}(n)\) is large, the closeness of \(\mathbf w (n-1)\) to \(\mathbf w _{\mathrm T}\) will be affected and the performance of the algorithm will be severely degraded. That is, the LMS is highly sensitive to large perturbations. Situations where large perturbations in the noise realization could appear with high probability are common.\(^{1}\) A typical example is the case of acoustic echo cancellation [20], where in the noise realization \(v(n)\) there might be a component associated with human speech from a local speaker. Human speech can have bursty components of high energy, which can be thought of as a realization of impulsive noise. In that application, and with a local speaker (the double-talk situation), the performance of algorithms like the LMS or the NLMS\(^{2}\) could be very poor. Therefore, it is important to obtain robust algorithms. Here we interpret the term robust as “only slightly sensitive to large perturbations (outliers)”.
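The sensitivity of the LMS to a large noise sample can be made explicit with one worked step. It follows directly from the LMS recursion given earlier in this section; the only assumption is that the noise sample dominates the error, so that \(e(n)\approx \textit{v}(n)\):
$$\begin{aligned} \Vert \mathbf w (n)-\mathbf w (n-1)\Vert = \mu \Vert \mathbf x (n)\Vert \,|e(n)| \approx \mu \Vert \mathbf x (n)\Vert \,|\textit{v}(n)|. \end{aligned}$$
The size of the update therefore grows linearly with \(|\textit{v}(n)|\): for the same input vector, a noise sample ten times larger moves the estimate roughly ten times further.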

It is known from Chap. 2 that the SEA presents robustness (in the sense defined above) to impulsive noise. However, we also saw that the SEA can present slow convergence speed. In order to obtain robustness without compromising the speed of convergence, several approaches have been taken. One of them is the use of robust statistics [35, 36, 37, 38], where the statistics of the noise signal \(v(n)\) should be taken into account. Another useful approach is the use of mixed-norm algorithms, where the algorithms are derived from cost functions that combine \(|e(n)|^2\) and \(|e(n)|\) [39, 40]. As we saw in Chap. 2, the term \(|e(n)|^2\) is the cost function used to obtain the LMS algorithm, and \(|e(n)|\) is the cost function used to obtain the SEA. In a sense, \(|e(n)|\) provides robustness and \(|e(n)|^2\) improves the speed of convergence. The algorithms obtained are an appropriately weighted combination of the LMS (or NLMS) and the SEA. A third useful approach is the use of switched-norm algorithms [10]. In this way, the algorithm is able to determine whether a robust behavior is needed or whether the convergence speed can be improved. This decision is determined only by the instantaneous environment that the adaptive filter is facing, without any consideration of the perturbation statistics. For further extensions of this approach the reader can consult [11, 12, 13].
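The difference between these updates can be seen in a few lines of code. The sketch below compares, for a single outlier in the error, the size of the step taken by the LMS, by the SEA, and by a fixed-weight mixed-norm update obtained from a cost of the form \(\lambda |e(n)|^2+(1-\lambda )|e(n)|\). It is only an illustration: the constant factor that appears when differentiating \(|e(n)|^2\) is absorbed into \(\mu \), the mixing weight `lam` is kept fixed, and all function names and numerical values are assumptions made here rather than the algorithms of [39, 40].

```python
import math


def sign(e):
    """Sign of e: -1, 0 or 1."""
    return (e > 0) - (e < 0)


def lms_update(w, x, e, mu):
    # Step proportional to the error: fast, but sensitive to outliers.
    return [wi + mu * xi * e for wi, xi in zip(w, x)]


def sea_update(w, x, e, mu):
    # Step depends only on the sign of the error: bounded, but slow.
    return [wi + mu * xi * sign(e) for wi, xi in zip(w, x)]


def mixed_norm_update(w, x, e, mu, lam):
    # Weighted combination of the two updates, with 0 <= lam <= 1.
    return [wi + mu * xi * (lam * e + (1.0 - lam) * sign(e))
            for wi, xi in zip(w, x)]


def step_norm(w_new, w_old):
    """Euclidean norm of the change in the weight vector."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(w_new, w_old)))


if __name__ == "__main__":
    w, x, mu = [0.0, 0.0], [3.0, 4.0], 0.01
    outlier = 100.0  # error dominated by an impulsive noise sample
    print("LMS  :", step_norm(lms_update(w, x, outlier, mu), w))
    print("SEA  :", step_norm(sea_update(w, x, outlier, mu), w))
    print("mixed:", step_norm(mixed_norm_update(w, x, outlier, mu, 0.5), w))
```

The SEA step is bounded by \(\mu \Vert \mathbf x (n)\Vert \) whatever the size of the outlier, while the LMS step scales with it; the mixed-norm step lies between the two.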

6.5 Distributed Adaptive Filtering

Wireless sensor networks (WSN) are a promising technology in various fields [41]. They comprise a large number of sensors which are able to communicate with each other wirelessly using a radio transceiver. The sensors are deployed over large geographical areas, without any particular planning, in order to perform a particular task in a collaborative manner. Although each sensor unit has limited sensing and data processing capabilities, together they are able to perform complex tasks: they compensate for their limited individual abilities through collaboration among a large number of units. There exist several applications that could benefit enormously from WSN: surveillance, health care monitoring, monitoring of disaster areas, etc. [42].

In some WSN applications the sensor observations should be used for the estimation of a network-wide parameter or signal of interest. In this way, an interesting distributed estimation problem is established: each sensor has to decide how to communicate its new observation to the other units, in order to improve the overall estimation capability of the network. At the same time, as the network should be operative for a long span of time with minimum or no maintenance, the energy consumption of each node should be kept to a minimum. As the radio transceiver is the most energy-demanding component, each node should transmit not every one of its observations but an intelligent combination of them.

Adaptive filtering tasks can be extended to a WSN scenario. In distributed adaptive filtering, a network of sensors has to estimate a given common signal or identify a common parameter using noise-corrupted local input-output pairs. With them, each node can update its local estimate and then transmit it to its neighbors. Incremental [43] and diffusion [44, 45] strategies can be implemented for the exchange of updates between nodes. The interested reader is referred to [46, 47, 48, 49] and the references therein.
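To illustrate the diffusion idea, the sketch below simulates a small network in which every node first performs a local LMS update with its own noisy data and then replaces its estimate with the average of the updated estimates of its neighbors (an adapt-then-combine ordering). The ring topology, the uniform combination weights, the data model and all names and parameter values are assumptions made for this example; the strategies analyzed in [44, 45] are more general.

```python
import random


def diffusion_lms(w_true, num_nodes=5, num_iters=2000, mu=0.05,
                  noise_std=0.1, seed=0):
    """Adapt-then-combine diffusion LMS on a ring; returns one estimate per node."""
    rng = random.Random(seed)
    num_taps = len(w_true)
    w = [[0.0] * num_taps for _ in range(num_nodes)]

    for _ in range(num_iters):
        # Adaptation: each node runs one LMS step on its local data pair.
        psi = []
        for k in range(num_nodes):
            x = [rng.gauss(0.0, 1.0) for _ in range(num_taps)]
            d = sum(a * b for a, b in zip(w_true, x)) + rng.gauss(0.0, noise_std)
            e = d - sum(a * b for a, b in zip(w[k], x))
            psi.append([wi + mu * xi * e for wi, xi in zip(w[k], x)])

        # Combination: each node averages its own intermediate estimate with
        # those of its two ring neighbors (uniform weights).
        w = []
        for k in range(num_nodes):
            nbrs = [(k - 1) % num_nodes, k, (k + 1) % num_nodes]
            w.append([sum(psi[l][i] for l in nbrs) / len(nbrs)
                      for i in range(num_taps)])

    return w


if __name__ == "__main__":
    for k, wk in enumerate(diffusion_lms([0.5, -1.0, 0.25])):
        print("node", k, [round(c, 3) for c in wk])
```

Only the intermediate estimates are exchanged, never the raw observations, which is in line with the requirement that each node transmits a combination of its data rather than every sample.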


Footnotes

  1. This is the situation when the probability density of the noise has heavy tails [35].
  2. Or any algorithm in which the update is linear in the error filtering signal.


References

  1. S. Haykin, Signal processing: where physics and mathematics meet. IEEE Signal Process. Mag. 18, 6–7 (2001)
  2. B. Hassibi, A.H. Sayed, T. Kailath, Linear estimation in Krein spaces. Parts I and II. IEEE Trans. Autom. Control 41, 18–49 (1996)
  3. B. Hassibi, A.H. Sayed, T. Kailath, \(H\,^{\infty }\) optimality of the LMS algorithm. IEEE Trans. Signal Process. 44, 267–280 (1996)
  4. U. Shaked, Y. Theodor, \(H\,^{\infty }\)-Optimal estimation: a tutorial. Proceedings of IEEE Conference on Decision Control, 1992, pp. 2278–2286
  5. H. Rey, L. Rey Vega, S. Tressens, J. Benesty, Variable explicit regularization in affine projection algorithm: robustness issues and optimal choice. IEEE Trans. Signal Process. 55, 2096–2109 (2007)
  6. M. Rupp, A.H. Sayed, A time-domain feedback analysis of filtered-error adaptive gradient algorithms. IEEE Trans. Signal Process. 44, 1428–1439 (1996)
  7. T. Aboulnasr, K. Mayyas, A robust variable step-size LMS-type algorithm: analysis and simulations. IEEE Trans. Signal Process. 45, 631–639 (1997)
  8. J. Benesty, H. Rey, L. Rey Vega, S. Tressens, A nonparametric VSS NLMS algorithm. IEEE Signal Process. Lett. 13, 581–584 (2006)
  9. C. Paleologu, J. Benesty, S. Ciochina, A variable step-size affine projection algorithm designed for acoustic echo cancellation. IEEE Trans. Audio Speech Lang. Process. 16, 1466–1478 (2008)
  10. L. Rey Vega, H. Rey, J. Benesty, S. Tressens, A new robust variable step-size NLMS algorithm. IEEE Trans. Signal Process. 56, 1878–1893 (2008)
  11. L. Rey Vega, H. Rey, J. Benesty, S. Tressens, A fast robust recursive least-squares algorithm. IEEE Trans. Signal Process. 57, 1209–1216 (2009)
  12. L. Rey Vega, H. Rey, J. Benesty, S. Tressens, A family of robust algorithms exploiting sparsity in adaptive filters. IEEE Trans. Audio Speech Lang. Process. 17, 572–581 (2009)
  13. L. Rey Vega, H. Rey, J. Benesty, A robust variable step-size affine projection algorithm. Elsevier Signal Process. 90, 2806–2810 (2010)
  14. K. Mayyas, A variable step-size affine projection algorithm. Elsevier Digital Signal Process. 20, 502–510 (2010)
  15. H.-C. Shin, A.H. Sayed, W.-J. Song, Variable step-size NLMS and affine projection algorithms. IEEE Signal Process. Lett. 11, 132–135 (2004)
  16. A. Mader, H. Puder, G. Schmidt, Step-size control for acoustic echo cancellation filters. An overview. Elsevier Signal Process. 80, 1697–1719 (2000)
  17. D. Challa, S.L. Grant, A. Mohammad, Variable regularized fast affine projections. Proceedings of IEEE ICASSP, 2007, pp. I-89–I-92
  18. Y. Choi, H. Shin, W. Song, Adaptive regularization matrix for affine projection algorithm. IEEE Trans. Circuits Syst. II 54, 1087–1091 (2007)
  19. W. Yin, A.S. Mehr, A variable regularization method for affine projection algorithm. IEEE Trans. Circuits Syst. II 57, 476–480 (2010)
  20. C. Breining, P. Dreiseitel, E. Hänsler, A. Mader, B. Nitsch, H. Puder, T. Schertler, G. Schmidt, J. Tilp, Acoustic echo control. An application of very-high-order adaptive filters. IEEE Signal Process. Mag. 16, 42–69 (1999)
  21. M.R. Raghavendra, K. Giridhar, Improving channel estimation in OFDM systems for sparse multipath channels. IEEE Signal Process. Lett. 12, 52–55 (2005)
  22. J. Haupt, W.U. Bajwa, G. Raz, R. Nowak, Toeplitz compressed sensing matrices with applications to sparse channel estimation. IEEE Trans. Inform. Theory 56, 5862–5875 (2010)
  23. D.L. Duttweiler, Proportionate normalized least mean square adaptation in echo cancelers. IEEE Trans. Speech Audio Process. 8, 508–518 (2000)
  24. J. Homer, I. Mareels, C. Hoang, Enhanced detection-guided NLMS estimation of sparse FIR-modeled signal channels. IEEE Trans. Circuits Syst. I 53, 1783–1791 (2006)
  25. J. Benesty, S.L. Gay, An improved PNLMS algorithm. Proceedings of IEEE ICASSP, 2002, pp. 1881–1884
  26. Y. Huang, J. Benesty, J. Chen, Acoustic MIMO Signal Processing (Springer, Berlin, 2006)
  27. R.K. Martin, W.A. Sethares, R.C. Williamson, C.R. Johnson Jr, Exploiting sparsity in adaptive filters. IEEE Trans. Signal Process. 50, 1883–1894 (2002)
  28. S. Amari, Natural gradient works efficiently in learning. Neural Computation 10, 251–276 (1998)
  29. R.E. Mahony, R.C. Williamson, Prior knowledge and preferential structures in gradient descent learning algorithms. Journal Mach. Learn. Research 1, 311–355 (2001)
  30. E.J. Candes, M.B. Wakin, An introduction to compressive sampling. IEEE Signal Process. Mag. 25, 21–30 (2008)
  31. Y. Chen, Y. Gu, A.O. Hero, Sparse LMS for system identification. Proceedings of IEEE ICASSP, 2009, pp. 3125–3128
  32. Y. Gu, J. Jin, S. Mei, \(l_{0}\) norm constraint LMS algorithm for sparse system identification. IEEE Signal Process. Lett. 16, 774–777 (2009)
  33. B. Babadi, N. Kalouptsidis, V. Tarokh, SPARLS: the sparse RLS algorithm. IEEE Trans. Signal Process. 58, 4013–4025 (2010)
  34. Y. Kopsinis, K. Slavakis, S. Theodoridis, Online sparse system identification and signal reconstruction using projections onto weighted \(\ell _{1}\) balls. IEEE Trans. Signal Process. 59, 936–952 (2011)
  35. P.J. Huber, Robust Statistics (John Wiley & Sons, New York, 1981)
  36. T. Gänsler, S.L. Gay, M.M. Sondhi, J. Benesty, Double-talk robust fast converging algorithms for network echo cancellation. IEEE Trans. Speech Audio Process. 8, 656–663 (2000)
  37. P. Petrus, Robust Huber adaptive filter. IEEE Trans. Signal Process. 47, 1129–1133 (1999)
  38. Y. Zou, S.C. Chan, T.S. Ng, Least mean \(M\)-estimate algorithms for robust adaptive filtering in impulse noise. IEEE Trans. Circuits Syst. II 47, 1564–1569 (2000)
  39. J. Chambers, A. Avlonitis, A robust mixed-norm adaptive filter algorithm. IEEE Signal Process. Lett. 4, 46–48 (1997)
  40. E.V. Papoulis, T. Stathaki, A normalized robust mixed-norm adaptive algorithm for system identification. IEEE Signal Process. Lett. 11, 56–59 (2004)
  41. A. Swami, Q. Zhao, Y. Hong, L. Tong, Wireless Sensor Networks. Signal Processing and Communication Perspectives (Wiley, West Sussex, 2007)
  42. I. Akyildiz, W. Su, Y. Sankarasubramaniam, E. Cayirci, A survey on sensor networks. IEEE Commun. Mag. 40, 102–114 (2002)
  43. C.G. Lopes, A.H. Sayed, Incremental adaptive strategies over distributed networks. IEEE Trans. Signal Process. 55, 4064–4077 (2007)
  44. F.S. Cattivelli, C.G. Lopes, A.H. Sayed, Diffusion recursive least-squares for distributed estimation over adaptive networks. IEEE Trans. Signal Process. 56, 1865–1877 (2008)
  45. C.G. Lopes, A.H. Sayed, Diffusion least-mean squares over adaptive networks: formulation and performance analysis. IEEE Trans. Signal Process. 56, 3122–3136 (2008)
  46. I.D. Schizas, G. Mateos, G.B. Giannakis, Distributed LMS for consensus-based in-network adaptive processing. IEEE Trans. Signal Process. 57, 2365–2382 (2009)
  47. A. Bertrand, M. Moonen, Distributed adaptive node-specific signal estimation in fully connected sensor networks - Part I: Sequential node updating. IEEE Trans. Signal Process. 58, 5277–5291 (2010)
  48. A. Bertrand, M. Moonen, Distributed adaptive node-specific signal estimation in fully connected sensor networks - Part II: Simultaneous and asynchronous node updating. IEEE Trans. Signal Process. 58, 5292–5306 (2010)
  49. S. Chouvardas, K. Slavakis, S. Theodoridis, Adaptive robust distributed learning in diffusion sensor networks. IEEE Trans. Signal Process. 59, 4692–4707 (2011)

Copyright information

© The Author(s) 2013

Authors and Affiliations

  1. School of Engineering, University of Buenos Aires, Buenos Aires, Argentina
  2. Department of Engineering, University of Leicester, Leicester, UK
