
Meta Net: A New Meta-Classifier Family

Chapter in: Data Mining Applications Using Artificial Adaptive Systems

Abstract

An innovative taxonomy for the classification of classifiers is presented. A new family of meta-classifiers, called Meta-Net, grounded in the theory of independent judges, is introduced, defined, and described, and is shown to perform very well when compared with other known meta-classifiers.
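The abstract's core idea is combining the outputs of several independent "judges" (base classifiers) into a single decision. The chapter's actual Meta-Net algorithm is not given here, so the following is only a minimal illustrative sketch under an assumed combination rule (accuracy-weighted voting); the judge functions and weighting scheme are hypothetical, not the authors' method.

```python
# Hedged sketch of an "independent judges" meta-classifier.
# Assumption: each judge votes, and votes are weighted by the judge's
# accuracy on training data. This is NOT the Meta-Net algorithm itself.
from collections import defaultdict

def accuracy(judge, data):
    """Fraction of (x, label) pairs the judge classifies correctly."""
    return sum(judge(x) == y for x, y in data) / len(data)

def meta_classifier(judges, train_data):
    """Build a combined classifier from independent judges."""
    # Weight each judge by its observed accuracy (illustrative choice).
    weights = [accuracy(j, train_data) for j in judges]

    def predict(x):
        votes = defaultdict(float)
        for w, j in zip(weights, judges):
            votes[j(x)] += w          # each judge casts a weighted vote
        return max(votes, key=votes.get)

    return predict

# Toy 1-D example: classify numbers as "pos" or "neg".
train = [(-2, "neg"), (-1, "neg"), (1, "pos"), (3, "pos")]
judges = [
    lambda x: "pos" if x > 0 else "neg",     # good threshold (100% on train)
    lambda x: "pos" if x > -1.5 else "neg",  # looser threshold (75%)
    lambda x: "neg",                         # weak judge: always "neg" (50%)
]
predict = meta_classifier(judges, train)
print(predict(2))   # "pos": the accurate judges outvote the weak one
```

The point of the sketch is only that the meta-classifier's decision depends on the collective, weighted behavior of the independent judges rather than on any single one.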


Notes

  1. We gratefully acknowledge Marco Intraligi (Semeion staff), who helped the authors during the training sessions.

  2. This detailed analysis of the misclassifications shared among algorithms was conducted following a suggestion by Dr. Giulia Massini (Semeion researcher).

  3. We acknowledge Dr. Giulia Massini (Semeion researcher), who generated this dendrogram (2010–2011) using software she wrote for the purpose.


Author information

Correspondence to Massimo Buscema.


Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Buscema, M., Tastle, W.J., Terzi, S. (2013). Meta Net: A New Meta-Classifier Family. In: Tastle, W. (eds) Data Mining Applications Using Artificial Adaptive Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4223-3_5


  • DOI: https://doi.org/10.1007/978-1-4614-4223-3_5

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-4222-6

  • Online ISBN: 978-1-4614-4223-3

  • eBook Packages: Computer Science (R0)
