Mining Several Databases with an Ensemble of Classifiers

  • Seppo Puuronen
  • Vagan Terziyan
  • Alexander Logvinovsky
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1677)


The results of knowledge discovery in databases can vary depending on the data mining method used. There are several ways to select the most appropriate data mining method dynamically. One proposed approach clusters the whole domain into “competence areas” of the methods, and a metamethod then decides which data mining method should be applied to each database instance. However, when knowledge is extracted from several databases, knowledge discovery may produce conflicting results even if the separate databases are internally consistent. At least two types of conflict may arise. The first is caused by data inconsistency within the intersection of the databases. The second arises when the metamethod, guided by inconsistent competence maps, selects different data mining methods for the objects in the intersection. We analyze these two types of conflict and their combinations and suggest ways to handle them.
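The metamethod described above can be illustrated with a minimal sketch: each classifier's local “competence” is estimated from the labelled instances nearest to the query, and the meta-level selects the classifier that is most competent in that area. All names here (`competence`, `nearest`, `meta_select`) and the toy one-dimensional data are illustrative assumptions, not the authors' actual algorithm.

```python
# Hedged sketch of competence-based dynamic classifier selection.
# Assumption: competence of a classifier near an instance is its
# accuracy on the k nearest labelled training instances.

def competence(classifier, neighbours):
    """Fraction of labelled neighbours the classifier predicts correctly."""
    if not neighbours:
        return 0.0
    hits = sum(1 for x, y in neighbours if classifier(x) == y)
    return hits / len(neighbours)

def nearest(instance, labelled, k=3):
    """The k labelled instances closest to `instance` (1-D toy distance)."""
    return sorted(labelled, key=lambda xy: abs(xy[0] - instance))[:k]

def meta_select(instance, classifiers, labelled, k=3):
    """Metamethod: pick the classifier most competent near `instance`."""
    neighs = nearest(instance, labelled, k)
    return max(classifiers, key=lambda c: competence(c, neighs))

# Two toy classifiers with different competence areas on the real line;
# the true class boundary in the labelled data lies at x = 7.
low_expert  = lambda x: "A" if x < 5 else "B"   # boundary too low
high_expert = lambda x: "A" if x < 7 else "B"   # matches the data

labelled = [(1, "A"), (2, "A"), (6, "A"), (6.5, "A"), (8, "B"), (9, "B")]

best = meta_select(6.2, [low_expert, high_expert], labelled)
print(best(6.2))  # high_expert wins locally and predicts "A"
```

When two such selections disagree on an instance that lies in the intersection of two databases, one conceivable resolution, hinted at by the “Integral Weight” keyword, is to weight each classifier's vote by its local competence rather than choosing a single winner.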


Keywords: Data Base, Classification Method, Classification Result, Training Instance, Integral Weight





Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Seppo Puuronen (1)
  • Vagan Terziyan (2)
  • Alexander Logvinovsky (2)
  1. University of Jyväskylä, Jyväskylä, Finland
  2. Kharkov State Technical University of Radioelectronics, Kharkov, Ukraine
