Efficient Discovery of Generalized Sentinel Rules

  • Morten Middelfart
  • Torben Bach Pedersen
  • Jan Krogsgaard
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6262)

Abstract

This paper proposes the concept of generalized sentinel rules (sentinels) and presents an algorithm for their discovery. Sentinels represent schema-level relationships between changes over time in certain measures in a multi-dimensional data cube. Sentinels notify users based on previous observations, e.g., that revenue might drop within two months if an increase in customer problems combined with a decrease in website traffic is observed. If the converse also holds, we have a bi-directional sentinel, which has a higher chance of being causal rather than coincidental. We significantly extend prior work by combining multiple measures into better sentinels and by auto-fitting the best warning period. We introduce two novel quality measures, Balance and Score, that are used for selecting the best sentinels. We also introduce an efficient algorithm that incorporates novel optimization techniques and scales to very large datasets, as verified by extensive experiments on both real and synthetic data. Moreover, we are able to discover strong and useful sentinels that could not be found using sequential pattern mining or correlation techniques.
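
To make the revenue example concrete, the following is a minimal Python sketch of the general idea only, not the algorithm proposed in the paper. It assumes that a "change" is a period-to-period relative change above a hypothetical threshold, and it ranks candidate warning periods by a plain hit count rather than by the paper's Balance and Score measures, which the abstract does not define. All function names, the threshold, and the toy data are illustrative assumptions.

```python
from typing import List, Optional, Sequence, Tuple


def indications(series: Sequence[float], threshold: float) -> List[int]:
    """Map each period-to-period relative change to +1, -1, or 0.

    The threshold is a hypothetical parameter, not a value from the paper.
    """
    out: List[int] = []
    for prev, cur in zip(series, series[1:]):
        if prev == 0:
            out.append(0)  # relative change undefined; treat as no indication
            continue
        change = (cur - prev) / abs(prev)
        out.append(1 if change >= threshold else -1 if change <= -threshold else 0)
    return out


def count_hits(
    sources: Sequence[Tuple[List[int], int]],
    target: List[int],
    target_dir: int,
    warning_period: int,
) -> Tuple[int, int]:
    """Return (hits, triggers) for one candidate warning period.

    A trigger is a period where every source measure moved in its required
    direction; a hit is a trigger followed by the target moving in
    target_dir exactly warning_period periods later.
    """
    n = min(len(ind) for ind, _ in sources)
    hits = triggers = 0
    for t in range(n):
        if t + warning_period >= len(target):
            break  # no target observation that far ahead
        if all(ind[t] == direction for ind, direction in sources):
            triggers += 1
            if target[t + warning_period] == target_dir:
                hits += 1
    return hits, triggers


def fit_warning_period(
    sources: Sequence[Tuple[List[int], int]],
    target: List[int],
    target_dir: int,
    max_period: int,
) -> Tuple[int, int, int]:
    """Pick the warning period in 1..max_period with the most hits.

    Ties keep the shortest period. This simple count stands in for the
    paper's quality measures, which are not reproduced here.
    """
    best: Optional[Tuple[int, int, int]] = None
    for w in range(1, max_period + 1):
        hits, triggers = count_hits(sources, target, target_dir, w)
        if best is None or hits > best[1]:
            best = (w, hits, triggers)
    assert best is not None, "max_period must be at least 1"
    return best


# Toy monthly data, invented for illustration only.
problems = indications([10, 12, 12, 15, 15, 14, 18, 18], 0.1)
traffic = indications([100, 85, 85, 70, 70, 72, 60, 60], 0.1)
revenue = indications([50, 50, 50, 40, 40, 32, 32, 32, 25, 25], 0.1)

# Rule candidate: problems up AND traffic down -> revenue down after w months.
print(fit_warning_period([(problems, 1), (traffic, -1)], revenue, -1, 3))  # (2, 3, 3)
```

On this toy data the sketch selects a warning period of two months, with all three triggers followed by a revenue drop. Checking the converse rule (problems down and traffic up, followed by revenue up) with the same functions would be one way to probe for a bi-directional relationship under these assumptions.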

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Morten Middelfart (1)
  • Torben Bach Pedersen (2)
  • Jan Krogsgaard (1)
  1. TARGIT A/S
  2. Department of Computer Science, Aalborg University
