Constraint Relaxations for Discovering Unknown Sequential Patterns

  • Conference paper
Knowledge Discovery in Inductive Databases (KDID 2004)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3377)

Abstract

The main drawbacks of sequential pattern mining have been its lack of focus on user expectations and the high number of discovered patterns. However, the commonly accepted solution, the use of constraints, reduces the mining process to verifying which of the specified patterns are frequent, instead of discovering unknown and unexpected patterns.

In this paper, we propose a new methodology for mining sequential patterns that keeps the focus on user expectations without compromising the discovery of unknown patterns. The methodology is based on constraint relaxations, which are used to filter the accepted patterns during the mining process. We propose a hierarchy of relaxations, applied to constraints expressed as context-free languages, that classifies the existing relaxations (legal, valid and naïve, previously proposed) and introduces several new classes. The new classes range from the approx and non-accepted relaxations to compositions of different types of relaxations, such as the approx-legal or the non-prefix-valid relaxations. Finally, we present a case study showing the results of applying this methodology to the analysis of the curricular sequences of computer science students.
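
The abstract does not reproduce the formal definitions, so the following is only a minimal sketch of how a relaxation can widen the filter applied to candidate patterns. It assumes that an approx relaxation accepts any pattern within a bounded edit distance of some sequence in the constraint language, and it replaces the context-free language with a small finite set of sequences. The names `edit_distance`, `accepts`, `approx_accepts`, `LANGUAGE` and `epsilon` are hypothetical and not taken from the paper.

```python
from typing import Iterable, List, Sequence, Set, Tuple

Pattern = Tuple[str, ...]


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two item sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute x by y (free if equal)
            ))
        prev = curr
    return prev[-1]


def accepts(pattern: Pattern, language: Set[Pattern]) -> bool:
    """Strict constraint: the pattern must belong to the language."""
    return pattern in language


def approx_accepts(pattern: Pattern, language: Iterable[Pattern],
                   epsilon: int) -> bool:
    """Assumed approx relaxation: the pattern is at most `epsilon` edits
    away from some sequence of the language."""
    return any(edit_distance(pattern, s) <= epsilon for s in language)


# Hypothetical constraint: a finite stand-in for a context-free language.
LANGUAGE: Set[Pattern] = {("a", "b", "c"), ("a", "c")}

candidates: List[Pattern] = [("a", "b", "c"), ("a", "b", "d"), ("d", "e", "f")]

strict = [p for p in candidates if accepts(p, LANGUAGE)]
relaxed = [p for p in candidates if approx_accepts(p, LANGUAGE, epsilon=1)]

print(strict)
print(relaxed)
```

Under these assumptions the strict filter keeps only patterns already specified by the user, while the relaxed filter also keeps a near miss that the user did not anticipate. A real miner would apply such a filter while candidates are being generated, and would need a parser rather than a finite set to handle a context-free language.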

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Antunes, C., Oliveira, A.L. (2005). Constraint Relaxations for Discovering Unknown Sequential Patterns. In: Goethals, B., Siebes, A. (eds) Knowledge Discovery in Inductive Databases. KDID 2004. Lecture Notes in Computer Science, vol 3377. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31841-5_2

  • DOI: https://doi.org/10.1007/978-3-540-31841-5_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25082-1

  • Online ISBN: 978-3-540-31841-5

  • eBook Packages: Computer Science (R0)
