From Local Patterns to Classification Models

Bringmann, Björn; Nijssen, Siegfried; Zimmermann, Albrecht

doi:10.1007/978-1-4419-7738-0_6

Abstract

Using pattern mining techniques to build predictive models is currently a popular research topic. These techniques aim to obtain classifiers with better predictive performance than greedily constructed models, and to allow predictive models to be built for data that is not represented as attribute-value vectors. In this chapter we provide an overview of recent techniques we developed for integrating pattern mining and classification tasks. The techniques span the full spectrum, from approaches that select relevant patterns from a previously mined set in order to propositionalize the data, through approaches that induce pattern-based rule sets, to algorithms that integrate pattern mining and model construction. To put our techniques in context, we also provide an overview of the algorithms most closely related to our approaches.
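
As an illustration of what propositionalization with patterns can look like, the following minimal sketch encodes each example as a binary vector with one feature per selected pattern. It assumes itemset data and subset matching; the function name `propositionalize` and the toy transactions and patterns are hypothetical and are not taken from the chapter.

```python
from typing import FrozenSet, List, Sequence

Itemset = FrozenSet[str]


def propositionalize(
    transactions: Sequence[Itemset], patterns: Sequence[Itemset]
) -> List[List[int]]:
    """Return one binary feature vector per transaction.

    Feature j is 1 if pattern j occurs in the transaction
    (here: the pattern is a subset of the transaction), else 0.
    """
    return [[1 if p <= t else 0 for p in patterns] for t in transactions]


# Hypothetical toy data: three transactions and two previously mined patterns.
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "c"}),
    frozenset({"b", "d"}),
]
patterns = [frozenset({"a", "c"}), frozenset({"b"})]

features = propositionalize(transactions, patterns)
print(features)
```

The resulting vectors are in attribute-value form, so any standard classifier could in principle be trained on them. For structured data such as trees or graphs, the subset test would be replaced by an appropriate pattern-matching operation.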

Author information

Authors and Affiliations

Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, 3001, Leuven, Belgium
Björn Bringmann, Siegfried Nijssen & Albrecht Zimmermann

Corresponding author

Correspondence to Björn Bringmann.

Editor information

Editors and Affiliations

Department of Knowledge Technologies, Jožef Stefan Institute, Jamova 39, Ljubljana, 1000, Slovenia
Sašo Džeroski
Mathematics and Computer Science, University of Antwerp, Middelheimlaan 1, Antwerpen, B-2020, Belgium
Bart Goethals
Dept. of Knowledge Technologies, Jožef Stefan Institute, Jamova cesta 39, Ljubljana, SI-1000, Slovenia
Panče Panov

About this chapter

Cite this chapter

Bringmann, B., Nijssen, S., Zimmermann, A. (2010). From Local Patterns to Classification Models. In: Džeroski, S., Goethals, B., Panov, P. (eds) Inductive Databases and Constraint-Based Data Mining. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-7738-0_6

DOI: https://doi.org/10.1007/978-1-4419-7738-0_6
Published: 18 November 2010
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-7737-3
Online ISBN: 978-1-4419-7738-0
eBook Packages: Computer Science, Computer Science (R0)
