Abstract
A new taxonomy for the classification of classifiers is presented. Within it, a new family of meta-classifiers called Meta-Net, founded on the theory of independent judges, is introduced, defined, and described, and is shown to compare favorably with other well-known meta-classifiers.
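Meta-classifiers of the kind the abstract describes combine the outputs of several independently trained base classifiers ("judges") into a single decision. As a minimal illustrative sketch only (simple plurality voting, not the Meta-Net algorithm itself, whose combination rule is defined in the chapter), such a combiner could look like:

```python
from collections import Counter

def majority_vote(judges, x):
    """Combine the predictions of independent base classifiers
    ("judges") by plurality vote. A generic ensemble sketch; the
    actual Meta-Net combination rule is more sophisticated."""
    votes = [judge(x) for judge in judges]
    return Counter(votes).most_common(1)[0][0]

# Three toy "judges" labelling a number as 'pos' or 'neg'.
judges = [
    lambda x: 'pos' if x > 0 else 'neg',
    lambda x: 'pos' if x >= 0 else 'neg',
    lambda x: 'neg',  # a deliberately biased judge
]

print(majority_vote(judges, 3.0))   # two of three judges say 'pos'
print(majority_vote(judges, -1.0))  # all three judges say 'neg'
```

The point of the independent-judges view is that, when the judges' errors are not strongly correlated, the combined decision can be more accurate than any single judge.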
Notes
1. We gratefully acknowledge Marco Intraligi (Semeion staff), who helped the authors during the training sessions.
2. This detailed analysis of the misclassifications shared among the algorithms was conducted thanks to a suggestion by Dr. Giulia Massini (Semeion researcher).
3. We acknowledge Dr. Giulia Massini (Semeion researcher), who generated this dendrogram (2010–2011) using specific software she wrote.
Copyright information
© 2013 Springer Science+Business Media New York
Cite this chapter
Buscema, M., Tastle, W.J., Terzi, S. (2013). Meta Net: A New Meta-Classifier Family. In: Tastle, W. (ed) Data Mining Applications Using Artificial Adaptive Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4223-3_5
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-4222-6
Online ISBN: 978-1-4614-4223-3
eBook Packages: Computer Science (R0)