
Instant Exceptional Model Mining Using Weighted Controlled Pattern Sampling

  • Sandy Moens
  • Mario Boley
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8819)

Abstract

When plugged into instant interactive data analytics processes, pattern mining algorithms are required to produce small collections of high-quality patterns in short amounts of time. In the case of Exceptional Model Mining (EMM), even heuristic approaches like beam search can fail to meet this requirement, because in EMM each search step requires a relatively expensive model induction. In this work, we extend previous work on high-performance controlled pattern sampling by introducing extra weighting functionality that gives more importance to certain data records in a dataset. We use the extended framework to quickly obtain patterns that are likely to show highly deviating models. Additionally, we combine this randomized approach with a heuristic pruning procedure that further optimizes the pattern quality. Experiments show that, in contrast to traditional beam search, this combined method is able to find higher-quality patterns within short time budgets.
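To make the idea of record weighting concrete, the following is a minimal sketch of a weighted two-step sampler, assuming the simplest setting: patterns are itemsets, and a pattern should be drawn with probability proportional to its weighted support (the sum of the weights of the records that contain it). The function name `sample_pattern`, the toy records, and the weights are illustrative assumptions; this is not necessarily the exact procedure or distribution used in the paper.

```python
import random


def sample_pattern(records, weights, rng):
    """Draw one itemset with probability proportional to its weighted support.

    Hypothetical helper for illustration; records are sets of items and
    weights are non-negative per-record importance values.
    """
    # Step 1: pick a record with probability proportional to w(r) * 2^|r|,
    # i.e. its weight times the number of itemsets it contains.
    record_scores = [w * 2 ** len(r) for r, w in zip(records, weights)]
    record = rng.choices(records, weights=record_scores, k=1)[0]
    # Step 2: pick a uniformly random subset of that record's items
    # (each item is kept independently with probability 1/2).
    return frozenset(item for item in sorted(record) if rng.random() < 0.5)


# Toy usage: the second record has weight 0 and therefore never contributes.
rng = random.Random(0)
records = [frozenset({"a", "b"}), frozenset({"c"})]
samples = [sample_pattern(records, [1.0, 0.0], rng) for _ in range(200)]
```

Under these assumptions, a pattern F is produced with probability proportional to the sum of w(r) over all records r that contain F, because the factor 2^|r| from the first step cancels the 1/2^|r| from the second. Raising the weight of records whose target values deviate strongly would then steer the sampler towards patterns that cover such records.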

Keywords

Controlled Pattern Sampling, Subgroup Discovery, Exceptional Model Mining

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Sandy Moens (1, 2)
  • Mario Boley (2, 3)
  1. University of Antwerp, Belgium
  2. University of Bonn, Germany
  3. Fraunhofer IAIS, Germany
