Abstract
In supervised learning, we often encounter groups consisting of more than two parameters. For example, parameters may correspond to words expressing the same meaning, music pieces in the same genre, or books released in the same year. Based on such auxiliary information, we can suppose that parameters in a group play similar roles in a problem and take similar values. In this paper, we propose Higher Order Fused (HOF) regularization, which incorporates smoothness among parameters with group structures as prior knowledge in supervised learning. We define the HOF penalty as the Lovász extension of a submodular higher-order potential function; used as a regularizer, it encourages parameters in a group to take similar estimated values. Moreover, we develop an efficient network flow algorithm for calculating the proximity operator of the regularized problem. We investigate the empirical performance of the proposed method on synthetic and real-world data.
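For orientation, here is the standard definition underlying the construction (textbook notation, not necessarily the paper's own): given a submodular set function \(F: 2^V \to \mathbb{R}\) on the parameter indices \(V = \{1, \dots, d\}\) with \(F(\emptyset) = 0\), its Lovász extension evaluates a parameter vector \(\boldsymbol{w} \in \mathbb{R}^d\) as

\[ \hat{F}(\boldsymbol{w}) = \sum_{j=1}^{d} w_{\pi(j)} \left( F(\{\pi(1), \dots, \pi(j)\}) - F(\{\pi(1), \dots, \pi(j-1)\}) \right), \]

where the permutation \(\pi\) sorts the entries so that \(w_{\pi(1)} \ge w_{\pi(2)} \ge \dots \ge w_{\pi(d)}\). Using such an extension as the penalty leads to the composite problem and the proximity operator that the paper's network flow algorithm is designed to evaluate:

\[ \min_{\boldsymbol{w} \in \mathbb{R}^d} \; L(\boldsymbol{w}) + \lambda \hat{F}(\boldsymbol{w}), \qquad \mathrm{prox}_{\lambda \hat{F}}(\boldsymbol{z}) = \operatorname*{argmin}_{\boldsymbol{w} \in \mathbb{R}^d} \frac{1}{2} \|\boldsymbol{w} - \boldsymbol{z}\|_2^2 + \lambda \hat{F}(\boldsymbol{w}), \]

where \(L\) is a smooth loss and \(\lambda > 0\) a regularization weight; proximal gradient schemes then alternate gradient steps on \(L\) with applications of \(\mathrm{prox}_{\lambda \hat{F}}\).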
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Takeuchi, K., Kawahara, Y., Iwata, T. (2015). Higher Order Fused Regularization for Supervised Learning with Grouped Parameters. In: Appice, A., Rodrigues, P., Santos Costa, V., Soares, C., Gama, J., Jorge, A. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2015. Lecture Notes in Computer Science, vol 9284. Springer, Cham. https://doi.org/10.1007/978-3-319-23528-8_36
DOI: https://doi.org/10.1007/978-3-319-23528-8_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23527-1
Online ISBN: 978-3-319-23528-8