
Deep multilayer multiple kernel learning

  • Part of the collection: Predictive Analytics Using Machine Learning
  • Published in: Neural Computing and Applications

Abstract

The multiple kernel learning (MKL) approach has been proposed for kernel methods and has shown high performance on several real-world applications. It consists in learning the optimal kernel from a single layer of multiple predefined kernels. Unfortunately, this approach is not rich enough to solve relatively complex problems. With the emergence and success of deep learning, multilayer multiple kernel learning (MLMKL) methods were inspired by the idea of deep architectures and introduced to improve conventional MKL methods. Such architectures learn deep kernel machines by exploring combinations of multiple kernels in a multilayer structure. However, existing MLMKL methods often have trouble optimizing the network for two or more layers, and they do not always outperform the simplest way of combining multiple kernels (i.e., MKL). To improve the effectiveness of MKL approaches, we introduce in this paper a novel backpropagation MLMKL framework. Specifically, we optimize the network with an adaptive backpropagation algorithm, using gradient ascent rather than the dual objective function or the estimation of the leave-one-out error. We evaluate the proposed method through a large set of experiments on a variety of benchmark data sets and successfully optimize the system over many layers. Empirical results show that our algorithm achieves high performance compared with the traditional MKL approach and existing MLMKL methods.
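The paper's analytic backpropagation updates are not reproduced in this preview, but the general idea the abstract describes — a multilayer combination of base kernels whose parameters are fitted by gradient ascent — can be sketched as follows. This is a minimal illustration only: the base-kernel widths (`GAMMAS`), the two-layer architecture (a softmax-weighted sum of RBF kernels passed through an exponential kernel), the kernel-target alignment objective, and the central-difference gradients are all assumptions made for the sketch, not the authors' actual formulation.

```python
import numpy as np

# Widths of the base Gaussian kernels in the first layer (illustrative choice).
GAMMAS = [0.1, 1.0, 10.0]

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def alignment(K, y):
    """Kernel-target alignment <K, yy^T>_F / (||K||_F ||yy^T||_F)."""
    Y = np.outer(y, y)
    return np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))

def two_layer_kernel(X, mu, beta):
    """Layer 1: softmax-weighted sum of base kernels; layer 2: exponential kernel."""
    w = np.exp(mu) / np.sum(np.exp(mu))          # positive weights summing to 1
    K1 = sum(wi * rbf_kernel(X, g) for wi, g in zip(w, GAMMAS))
    return np.exp(beta * K1)                     # second-layer nonlinearity

def train(X, y, lr=0.2, steps=50):
    """Gradient ascent on alignment w.r.t. all layer parameters.

    theta[:-1] are the unnormalized layer-1 weights and theta[-1] the log
    of the layer-2 scale; central differences stand in here for the
    analytic backpropagated gradients derived in the paper.
    """
    theta = np.zeros(len(GAMMAS) + 1)
    def obj(t):
        return alignment(two_layer_kernel(X, t[:-1], np.exp(t[-1])), y)
    eps = 1e-5
    for _ in range(steps):
        grad = np.zeros_like(theta)
        for i in range(theta.size):
            tp, tm = theta.copy(), theta.copy()
            tp[i] += eps
            tm[i] -= eps
            grad[i] = (obj(tp) - obj(tm)) / (2.0 * eps)
        theta += lr * grad                       # ascent step
    return theta, obj(theta)
```

On separable toy data, the learned weights shift mass toward the base kernel whose width best fits the data, and the alignment of the two-layer kernel increases over the iterations.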


Notes

  1. Available at: https://archive.ics.uci.edu/ml/datasets.html.

  2. Available at: http://sci2s.ugr.es/keel/category.php?cat=clas.

  3. Available at: https://sites.google.com/site/xinxingxu666/.

  4. Available at: https://github.com/ericstrobl/deepMKL.



Author information

Correspondence to Ilyes Rebai.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


About this article


Cite this article

Rebai, I., BenAyed, Y. & Mahdi, W. Deep multilayer multiple kernel learning. Neural Comput & Applic 27, 2305–2314 (2016). https://doi.org/10.1007/s00521-015-2066-x

