Towards Automatic Generation of Metafeatures

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9651)

Abstract

The selection of metafeatures for metalearning (MtL) is often an ad hoc process. Choosing one metafeature over others without a proper motivation is questionable and may cause a loss of valuable information for a given problem (e.g., using class entropy but not attribute entropy). We present a framework to systematically generate metafeatures in the context of MtL. The framework decomposes a metafeature into three components: meta-function, object and post-processing. Automatic generation is triggered by selecting a meta-function, which is then used to systematically generate metafeatures from all possible combinations of object and post-processing alternatives. We ran experiments on the problem of algorithm selection for classification datasets. The results show that the sets of systematic metafeatures generated by our framework are more informative than the non-systematic ones and than the set regarded as state-of-the-art.
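
The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy dataset, the dictionary names (`META_FUNCTIONS`, `OBJECTS`, `POST_PROCESSING`), the naming scheme for the generated metafeatures and the particular post-processing choices (mean, min, max) are all assumptions made for illustration. The sketch picks entropy as the meta-function and applies it to every combination of object and post-processing alternative.

```python
from collections import Counter
from itertools import product
from math import log2
from statistics import mean


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


# Hypothetical toy dataset: two discrete attributes and a class label.
dataset = {
    "attributes": {
        "outlook": ["sunny", "sunny", "rain", "rain"],
        "windy": ["yes", "no", "yes", "yes"],
    },
    "target": ["no", "no", "yes", "yes"],
}

# Assumed component alternatives; the paper's actual sets may differ.
META_FUNCTIONS = {"entropy": entropy}

# Each object extractor returns a list of value sequences
# to which the meta-function is applied one by one.
OBJECTS = {
    "class": lambda d: [d["target"]],
    "attributes": lambda d: list(d["attributes"].values()),
}

# Post-processing aggregates the raw meta-function outputs into one number.
POST_PROCESSING = {"mean": mean, "min": min, "max": max}


def generate_metafeatures(data, meta_function_name):
    """Apply one meta-function to every object/post-processing combination."""
    meta_function = META_FUNCTIONS[meta_function_name]
    features = {}
    for (obj_name, extract), (pp_name, post) in product(
        OBJECTS.items(), POST_PROCESSING.items()
    ):
        raw = [meta_function(values) for values in extract(data)]
        features[f"{meta_function_name}.{obj_name}.{pp_name}"] = post(raw)
    return features


metafeatures = generate_metafeatures(dataset, "entropy")
for name, value in sorted(metafeatures.items()):
    print(f"{name}: {value:.4f}")
```

Selecting a single meta-function here yields both class entropy and aggregated attribute entropy, which is the kind of information the abstract notes can be lost when metafeatures are picked by hand.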

Notes

  1. The estimates of performance showed that NaiveBayes was better in 4 datasets, k-NN in 9, C5.0 in 23, CART in 2, SVM in 14 and Random Forest in 6.

Acknowledgments

This research has received funding from the ECSEL Joint Undertaking, the Framework Programme for Research and Innovation Horizon 2020 (2014–2020), under grant agreement no. 662189-MANTIS-2014-1.

Author information

Corresponding author

Correspondence to Fábio Pinto.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Pinto, F., Soares, C., Mendes-Moreira, J. (2016). Towards Automatic Generation of Metafeatures. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol 9651. Springer, Cham. https://doi.org/10.1007/978-3-319-31753-3_18

  • DOI: https://doi.org/10.1007/978-3-319-31753-3_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31752-6

  • Online ISBN: 978-3-319-31753-3

  • eBook Packages: Computer Science, Computer Science (R0)
