Decision Tree Using Local Support Vector Regression for Large Datasets

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10751)

Abstract

We propose tSVR, a decision tree that uses local support vector regression (SVR) models to handle regression on large datasets. The tSVR learning algorithm has two main steps. First, it builds a decision tree regressor that partitions the full training dataset into k terminal nodes (subsets). Second, it learns one SVR model from each terminal node, in parallel on multi-core computers, to predict the data locally. tSVR trains non-linear regression models on large datasets faster than standard SVR while maintaining high prediction accuracy. Numerical tests on datasets from the UCI repository show that tSVR is efficient compared with standard SVR.
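The two steps described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say which libraries or settings were used, so scikit-learn's DecisionTreeRegressor and SVR, joblib for the parallel step, the class name TreeLocalSVR, and every default hyperparameter (k, min_leaf, C, gamma, epsilon, n_jobs) are assumptions made here.

```python
import numpy as np
from joblib import Parallel, delayed
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor


def _fit_local_svr(X_leaf, y_leaf, C, gamma, epsilon):
    """Fit one RBF-kernel SVR on the training points of a single leaf."""
    model = SVR(kernel="rbf", C=C, gamma=gamma, epsilon=epsilon)
    return model.fit(X_leaf, y_leaf)


class TreeLocalSVR:
    """Hypothetical sketch: a tree partitions the data, one SVR per leaf."""

    def __init__(self, k=8, min_leaf=20, C=10.0, gamma="scale",
                 epsilon=0.1, n_jobs=1):
        self.k = k                # maximum number of terminal nodes
        self.min_leaf = min_leaf  # assumed lower bound on points per leaf
        self.C = C
        self.gamma = gamma
        self.epsilon = epsilon
        self.n_jobs = n_jobs      # number of parallel workers for step 2

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        # Step 1: a tree regressor splits the training set into <= k leaves.
        self.tree_ = DecisionTreeRegressor(
            max_leaf_nodes=self.k,
            min_samples_leaf=self.min_leaf,
            random_state=0,
        ).fit(X, y)
        leaf_of = self.tree_.apply(X)  # leaf index of every training point
        leaf_ids = np.unique(leaf_of)
        # Step 2: learn one local SVR per leaf; the fits are independent,
        # so they can be dispatched to several workers.
        models = Parallel(n_jobs=self.n_jobs)(
            delayed(_fit_local_svr)(
                X[leaf_of == leaf], y[leaf_of == leaf],
                self.C, self.gamma, self.epsilon,
            )
            for leaf in leaf_ids
        )
        self.models_ = dict(zip(leaf_ids.tolist(), models))
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        leaf_of = self.tree_.apply(X)  # route each point to its leaf
        y_pred = np.empty(X.shape[0])
        for leaf in np.unique(leaf_of):
            mask = leaf_of == leaf
            # Only the SVR of the matching leaf predicts these points.
            y_pred[mask] = self.models_[int(leaf)].predict(X[mask])
        return y_pred
```

A call such as TreeLocalSVR(k=4, n_jobs=2).fit(X_train, y_train).predict(X_test) would train up to four local models on two workers. Because each local model sees only the points of its own leaf, k trades training time against how much data each SVR has to learn from.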

Notes

  1. Note that the complexity analysis of tSVR excludes the tree regressor learned to split the full dataset. However, training the tree regressor has a very low computational cost compared with the quadratic programming solution required by the SVR learning algorithm.
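A rough way to see why the tree cost can be left out of the analysis is sketched below. It rests on two simplifying assumptions that are not figures taken from the paper: SVR training cost grows at least quadratically in the number of training points m, and the tree yields k leaves of roughly equal size m/k.

```latex
\[
\underbrace{O\!\left(m^{2}\right)}_{\text{one global SVR}}
\qquad\text{vs.}\qquad
\underbrace{k \cdot O\!\left(\left(\frac{m}{k}\right)^{2}\right)
  = O\!\left(\frac{m^{2}}{k}\right)}_{\text{$k$ local SVRs trained one after another}}
\]
```

Under these assumptions the local models cost about k times less than the global one even before any parallel speedup. Inducing a regression tree is typically on the order of n m log m for n features, which is lower order than either quadratic term when m is large.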

Corresponding author

Correspondence to Thanh-Nghi Do.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Tran-Nguyen, MT., Bui, LD., Kim, YG., Do, TN. (2018). Decision Tree Using Local Support Vector Regression for Large Datasets. In: Nguyen, N., Hoang, D., Hong, TP., Pham, H., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2018. Lecture Notes in Computer Science, vol 10751. Springer, Cham. https://doi.org/10.1007/978-3-319-75417-8_24

  • DOI: https://doi.org/10.1007/978-3-319-75417-8_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-75416-1

  • Online ISBN: 978-3-319-75417-8

  • eBook Packages: Computer Science (R0)
