Modeling of Network Computing Systems for Decision Tree Induction Tasks

  • Conference paper
Intelligent Data Engineering and Automated Learning - IDEAL 2009 (IDEAL 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5788)

Abstract

As the amount of information grows rapidly, there is strong interest in efficient network computing systems, including Grids, public-resource computing systems, P2P systems, and cloud computing. In this paper we take a detailed look at the problem of modeling and optimizing network computing systems for parallel decision tree induction methods. First, we present a comprehensive discussion of these induction methods, with a special focus on their parallel versions. Next, we propose a generic optimization model of a network computing system that can be used for a distributed implementation of parallel decision trees. To illustrate our work, we provide results of numerical experiments showing that the distributed approach significantly improves system throughput.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Walkowiak, K., Woźniak, M. (2009). Modeling of Network Computing Systems for Decision Tree Induction Tasks. In: Corchado, E., Yin, H. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2009. IDEAL 2009. Lecture Notes in Computer Science, vol 5788. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04394-9_93

  • DOI: https://doi.org/10.1007/978-3-642-04394-9_93

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04393-2

  • Online ISBN: 978-3-642-04394-9

  • eBook Packages: Computer Science, Computer Science (R0)
