Instant Exceptional Model Mining Using Weighted Controlled Pattern Sampling

  • Conference paper
Advances in Intelligent Data Analysis XIII (IDA 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8819)

Abstract

When plugged into instant interactive data analytics processes, pattern mining algorithms are required to produce small collections of high-quality patterns in short amounts of time. In the case of Exceptional Model Mining (EMM), even heuristic approaches like beam search can fail to meet this requirement, because in EMM each search step requires a relatively expensive model induction. In this work, we extend previous work on high-performance controlled pattern sampling by introducing extra weighting functionality that gives more importance to certain data records in a dataset. We use the extended framework to quickly obtain patterns that are likely to show highly deviating models. Additionally, we combine this randomized approach with a heuristic pruning procedure that further improves pattern quality. Experiments show that, compared with traditional beam search, this combined method finds higher-quality patterns within short time budgets.
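To make the idea of record weighting concrete, the following is a minimal sketch of a weighted two-step pattern sampler. It is not the authors' implementation: it assumes itemset patterns, the simplest frequency-based sampling distribution, and hypothetical names (`sample_pattern`, `records`, `weights`). A record D is drawn with probability proportional to w(D) · 2^|D|, and a uniformly random subset of D is returned, so each itemset is drawn with probability proportional to the total weight of the records that contain it.

```python
import random


def sample_pattern(records, weights, rng):
    """Draw one itemset with probability proportional to its weighted support.

    Assumption (illustrative only): records are sets of items and weights are
    non-negative numbers, one per record, with at least one positive weight.
    """
    # Step 1: pick a record D with probability proportional to w(D) * 2^|D|.
    scores = [w * 2 ** len(r) for r, w in zip(records, weights)]
    record = rng.choices(records, weights=scores, k=1)[0]
    # Step 2: keep each item of D independently with probability 1/2,
    # which yields a uniformly random subset of D (possibly empty).
    return frozenset(item for item in sorted(record) if rng.random() < 0.5)


if __name__ == "__main__":
    rng = random.Random(0)
    records = [frozenset("ab"), frozenset("a")]
    weights = [1.0, 2.0]  # the second record counts twice as much
    for _ in range(5):
        print(sorted(sample_pattern(records, weights, rng)))
```

Raising the weight of records whose target values deviate from the rest of the data would bias such a sampler toward patterns that cover those records, which is the intuition the abstract describes. How the weights are actually chosen is defined in the paper itself and is not reproduced here.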

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Moens, S., Boley, M. (2014). Instant Exceptional Model Mining Using Weighted Controlled Pattern Sampling. In: Blockeel, H., van Leeuwen, M., Vinciotti, V. (eds) Advances in Intelligent Data Analysis XIII. IDA 2014. Lecture Notes in Computer Science, vol 8819. Springer, Cham. https://doi.org/10.1007/978-3-319-12571-8_18

  • DOI: https://doi.org/10.1007/978-3-319-12571-8_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12570-1

  • Online ISBN: 978-3-319-12571-8

  • eBook Packages: Computer Science, Computer Science (R0)
