Abstract
As the volume of information grows rapidly, there is strong interest in efficient network computing systems, including Grids, public-resource computing systems, P2P systems and cloud computing. In this paper we take a detailed look at the problem of modeling and optimizing network computing systems for parallel decision tree induction methods. First, we present a comprehensive discussion of these induction methods, with a special focus on their parallel versions. Next, we propose a generic optimization model of a network computing system that can be used for a distributed implementation of parallel decision trees. To illustrate our work, we provide results of numerical experiments showing that the distributed approach significantly improves system throughput.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Walkowiak, K., Woźniak, M. (2009). Modeling of Network Computing Systems for Decision Tree Induction Tasks. In: Corchado, E., Yin, H. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2009. IDEAL 2009. Lecture Notes in Computer Science, vol 5788. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04394-9_93
DOI: https://doi.org/10.1007/978-3-642-04394-9_93
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04393-2
Online ISBN: 978-3-642-04394-9
eBook Packages: Computer Science (R0)