Abstract
Learning multiple descriptions for each class in the data has been shown to reduce generalization error, but the amount of error reduction varies greatly from domain to domain. This paper presents a novel empirical analysis that helps to explain this variation. Our hypothesis is that the amount of error reduction is linked to the degree to which the descriptions for a class make errors in a correlated manner. We present a precise and novel definition for this notion and use twenty-nine data sets to show that the amount of observed error reduction is negatively correlated with the degree to which the descriptions make errors in a correlated manner. We also show empirically that it is possible to learn descriptions that make less-correlated errors in domains in which many ties in the search evaluation measure (e.g., information gain) are encountered during learning. Finally, the paper presents results that help to explain when and why multiple descriptions help greatly (domains with irrelevant attributes) and when they help less (domains with large amounts of class noise).
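To make the central notion concrete, the following sketch computes a simple proxy for "correlated errors": the fraction of test examples on which two classifiers both err, among those on which at least one errs. This is an illustrative measure only, assumed for exposition; it is not the paper's precise definition, and the function name and example data are hypothetical.

```python
import numpy as np

def correlated_error_proportion(errors_a, errors_b):
    """Illustrative proxy (not the paper's definition): fraction of
    examples on which both classifiers err, among the examples on
    which at least one of them errs."""
    a = np.asarray(errors_a, dtype=bool)
    b = np.asarray(errors_b, dtype=bool)
    either = np.logical_or(a, b)
    if either.sum() == 0:
        return 0.0  # neither classifier makes any errors
    return float(np.logical_and(a, b).sum() / either.sum())

# Hypothetical example: two classifiers evaluated on ten test
# examples (True = the classifier misclassified that example).
errs_1 = [True, False, True, False, False, True, False, False, False, False]
errs_2 = [True, False, False, True, False, True, False, False, False, False]
print(correlated_error_proportion(errs_1, errs_2))  # 2 shared / 4 total = 0.5
```

Under this proxy, a value near 1 means the descriptions tend to fail on the same examples (so voting among them recovers little), while a value near 0 means their errors are spread over different examples, which is the regime in which combining multiple descriptions reduces error most.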
Ali, K.M., Pazzani, M.J. Error reduction through learning multiple descriptions. Mach Learn 24, 173–202 (1996). https://doi.org/10.1007/BF00058611