Abstract
In this paper, we propose a new approach to applying the meta-learning concept to distributed data mining. We call this approach Knowledge Probing, in which a supervised learning process is organised into two learning stages. In the first stage, a set of base classifiers is learned in parallel from a distributed data set. In the second stage, meta-learning is applied to induce the relationship between an attribute vector and the class predictions of all the base classifiers. When this approach is applied to an environment where base classifiers are produced from distributed data sources, the output of the Knowledge Probing process can be viewed as the assimilated knowledge of that distributed learning system. Some initial experimental results on the quality of the assimilated knowledge are presented. We believe that integrating the Knowledge Probing technique with available data mining algorithms can provide a practical framework for distributed data mining applications.
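The two stages described in the abstract can be illustrated with a minimal sketch. Several details here are assumptions, not taken from the paper: the base predictions are combined by majority vote, the learner is a simple one-feature decision stump, the second stage is run over a separate unlabelled probing set, and the names `train_stump` and `knowledge_probing` are hypothetical.

```python
from collections import Counter


def train_stump(rows):
    """Fit a one-feature threshold rule to a list of (features, label) rows.

    Stand-in learner for illustration only; any supervised learner could be used.
    """
    best = None
    for f in range(len(rows[0][0])):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    _, f, threshold, left_label, right_label = best
    return lambda x: left_label if x[f] <= threshold else right_label


def knowledge_probing(partitions, probing_set, learner=train_stump):
    """Return (final_model, base_models) for the two-stage process."""
    # Stage 1: one base classifier per data partition
    # (sequential here; each call is independent and could run in parallel).
    base_models = [learner(part) for part in partitions]
    # Stage 2: label every probing vector with the combined base prediction
    # (majority vote is an assumption), then learn a single model that maps
    # attribute vectors to that combined prediction.
    probed = [
        (x, Counter(clf(x) for clf in base_models).most_common(1)[0][0])
        for x in probing_set
    ]
    return learner(probed), base_models


# Toy data: three "sites", each holding part of a one-attribute data set.
partitions = [
    [((1,), 0), ((2,), 0), ((6,), 1), ((7,), 1)],
    [((3,), 0), ((4,), 0), ((8,), 1), ((9,), 1)],
    [((0,), 0), ((5,), 0), ((6,), 1), ((10,), 1)],
]
probing_set = [(i,) for i in range(11)]  # unlabelled attribute vectors
final_model, base_models = knowledge_probing(partitions, probing_set)
print([final_model(x) for x in probing_set])
```

In this sketch the single `final_model` plays the role of the assimilated knowledge: it approximates the joint behaviour of the base classifiers without requiring the distributed data to be pooled.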
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Guo, Y., Sutiwaraphun, J. (1999). Probing Knowledge in Distributed Data Mining. In: Zhong, N., Zhou, L. (eds) Methodologies for Knowledge Discovery and Data Mining. PAKDD 1999. Lecture Notes in Computer Science, vol 1574. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48912-6_59
DOI: https://doi.org/10.1007/3-540-48912-6_59
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65866-5
Online ISBN: 978-3-540-48912-2
eBook Packages: Springer Book Archive