Abstract
In standard online learning, the learner's goal is to suffer cumulative loss not much larger than that of the best-performing prediction function from some fixed class. Numerous algorithms have been shown to drive this gap arbitrarily close to zero relative to the best function chosen offline. Nevertheless, many real-world applications, such as adaptive filtering, are non-stationary in nature, and the best prediction function may not be fixed but rather drift over time. We introduce a new algorithm for regression that uses a per-feature learning rate, and we provide a regret bound with respect to the best sequence of functions with drift. We show that as long as the cumulative drift is sub-linear in the length of the sequence, our algorithm suffers regret that is sub-linear as well. We also sketch an algorithm that achieves the best of both worlds: logarithmic (log T) regret in the stationary setting and sub-linear regret in the non-stationary setting. Simulations demonstrate the usefulness of our algorithm compared with other state-of-the-art approaches.
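The abstract does not spell out the update rule, so the sketch below is only a rough illustration of the kind of algorithm described: an online linear regressor with a per-feature learning rate that shrinks as evidence accumulates, plus a periodic "re-adaptation" (reset) of those rates so the learner can keep tracking a drifting target. Here regret is measured against the best comparator sequence u_1, ..., u_T whose cumulative drift Σ_t ||u_{t+1} − u_t|| is sub-linear in T. The class name, the regularizer r, and the reset rule are illustrative assumptions, not the authors' exact method.

```python
import numpy as np

class PerFeatureOnlineRegressor:
    """Hypothetical sketch of a per-feature learning-rate regressor for
    drifting targets: a diagonal second-order (RLS-style) update with a
    periodic reset. NOT the exact algorithm of Vaits & Crammer (2011)."""

    def __init__(self, dim, r=1.0, reset_threshold=1e-3):
        self.w = np.zeros(dim)      # weight vector
        self.sigma = np.ones(dim)   # per-feature step sizes ("uncertainty")
        self.r = r                  # regularization constant (assumed)
        self.reset_threshold = reset_threshold  # re-inflate sigma below this

    def predict(self, x):
        return float(self.w @ x)

    def update(self, x, y):
        err = y - self.predict(x)
        # r + x^T diag(sigma) x, shared by both updates below.
        denom = self.r + np.sum(self.sigma * x * x)
        # Per-feature step: features with small remaining sigma move less.
        self.w += (self.sigma * x) * err / denom
        # Shrink sigma where the current example was informative
        # (diagonal restriction of a full covariance update).
        self.sigma -= (self.sigma * x) ** 2 / denom
        # Re-adapt: once the step sizes collapse, re-inflate them so the
        # model can keep following a drifting best predictor.
        if self.sigma.min() < self.reset_threshold:
            self.sigma = np.ones_like(self.sigma)
```

A toy run against a slowly drifting target illustrates the intended usage; without the reset, the per-feature step sizes would shrink toward zero and the model would stop adapting:

```python
rng = np.random.default_rng(0)
model = PerFeatureOnlineRegressor(dim=5)
w_star = rng.normal(size=5)
for t in range(1000):
    x = rng.normal(size=5)
    model.update(x, float(w_star @ x))
    # Decaying drift: cumulative drift grows like log T (sub-linear).
    w_star += 0.01 * rng.normal(size=5) / (t + 1)
```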
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vaits, N., Crammer, K. (2011). Re-adapting the Regularization of Weights for Non-stationary Regression. In: Kivinen, J., Szepesvári, C., Ukkonen, E., Zeugmann, T. (eds.) Algorithmic Learning Theory. ALT 2011. Lecture Notes in Computer Science, vol. 6925. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24412-4_12
DOI: https://doi.org/10.1007/978-3-642-24412-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24411-7
Online ISBN: 978-3-642-24412-4