Towards Automatic Generation of Metafeatures

Pinto, Fábio; Soares, Carlos; Mendes-Moreira, João

doi:10.1007/978-3-319-31753-3_18

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9651)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

The selection of metafeatures for metalearning (MtL) is often an ad hoc process. Choosing one metafeature over others without a clear motivation is questionable and may cause a loss of valuable information for a given problem (e.g., using class entropy but not attribute entropy). We present a framework to systematically generate metafeatures in the context of MtL. The framework decomposes a metafeature into three components: meta-function, object and post-processing. Automatic generation is triggered by selecting a meta-function, which is then used to generate metafeatures from all possible combinations of object and post-processing alternatives. We evaluated the framework on the problem of algorithm selection for classification datasets. Results show that the sets of systematic metafeatures generated by our framework are more informative than both the non-systematic sets and the set regarded as state of the art.
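
The decomposition described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function `generate_metafeatures`, the toy dataset, the naming scheme for the generated metafeatures, and the choice of mean, min and max as post-processing alternatives are all assumptions made for illustration. The sketch takes entropy as the meta-function, applies it to two kinds of object (the attributes and the class), and aggregates the results with every post-processing function.

```python
from collections import Counter
from itertools import product
from math import log2
from statistics import mean


def entropy(values):
    """Meta-function: Shannon entropy (in bits) of a sequence of discrete values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def generate_metafeatures(meta_function, objects, post_processors):
    """Hypothetical helper: one metafeature per (object, post-processing) pair."""
    metafeatures = {}
    for (obj_name, columns), (pp_name, pp) in product(
        objects.items(), post_processors.items()
    ):
        # Apply the meta-function to each column of the object,
        # then summarise the raw values with the post-processing function.
        raw = [meta_function(col) for col in columns]
        name = f"{meta_function.__name__}.{obj_name}.{pp_name}"
        metafeatures[name] = pp(raw)
    return metafeatures


# Toy dataset (invented for this example): two discrete attributes and a class.
attributes = {
    "outlook": ["sunny", "sunny", "rain", "rain"],
    "windy": ["yes", "no", "yes", "yes"],
}
target = ["play", "stay", "play", "play"]

# Objects: each maps to a list of columns the meta-function is applied to.
objects = {"attributes": list(attributes.values()), "class": [target]}

# Post-processing alternatives (an assumed, deliberately small set).
post_processors = {"mean": mean, "min": min, "max": max}

metafeatures = generate_metafeatures(entropy, objects, post_processors)
for name, value in sorted(metafeatures.items()):
    print(f"{name} = {value:.3f}")
```

With two objects and three post-processing functions, the single choice of entropy yields six metafeatures, so class entropy and attribute entropy are both produced rather than one being picked by hand.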

Notes

1. The estimates of performance showed that NaiveBayes was better in 4 datasets, k-NN in 9, C5.0 in 23, CART in 2, SVM in 14 and Random Forest in 6.

Acknowledgments

This research has received funding from the ECSEL Joint Undertaking, the framework programme for research and innovation Horizon 2020 (2014–2020), under grant agreement no. 662189-MANTIS-2014-1.

Author information

Authors and Affiliations

INESC TEC/Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias s/n, 4200-465, Porto, Portugal
Fábio Pinto, Carlos Soares & João Mendes-Moreira

Corresponding author

Correspondence to Fábio Pinto.

Editor information

Editors and Affiliations

The University of Melbourne, Melbourne, Victoria, Australia
James Bailey
The University of Texas at Dallas, Richardson, Texas, USA
Latifur Khan
Osaka University, Osaka, Japan
Takashi Washio
University of Auckland, Auckland, New Zealand
Gill Dobbie
Shenzhen University, Shenzhen, China
Joshua Zhexue Huang
Massey University, Auckland, New Zealand
Ruili Wang

About this paper

Cite this paper

Pinto, F., Soares, C., Mendes-Moreira, J. (2016). Towards Automatic Generation of Metafeatures. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol 9651. Springer, Cham. https://doi.org/10.1007/978-3-319-31753-3_18

DOI: https://doi.org/10.1007/978-3-319-31753-3_18
Published: 12 April 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31752-6
Online ISBN: 978-3-319-31753-3
eBook Packages: Computer Science, Computer Science (R0)
