Modeling of Network Computing Systems for Decision Tree Induction Tasks

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5788)


As the amount of information grows rapidly, there is strong interest in efficient network computing systems, including Grids, public-resource computing systems, P2P systems, and cloud computing. In this paper we take a detailed look at the problem of modeling and optimizing network computing systems for parallel decision tree induction methods. First, we present a comprehensive discussion of these induction methods, with a special focus on their parallel versions. Next, we propose a generic optimization model of a network computing system that can be used for a distributed implementation of parallel decision trees. To illustrate our work, we provide results of numerical experiments showing that the distributed approach enables a significant improvement in system throughput.


Keywords: Machine Learning, Network Computing, Grids, Modeling, Optimization, Parallel Decision Tree





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  1. Chair of Systems and Computer Networks, Faculty of Electronics, Wroclaw University of Technology, Wroclaw, Poland
