Improving Efficiency of Frequent Query Discovery by Eliminating Non-relevant Candidates

  • Conference paper
Discovery Science (DS 2003)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2843)

Abstract

This paper presents an algorithm for Frequent Query Discovery (FQD) that employs a novel equivalence relation to remove redundant queries from the output. An FQD algorithm returns a set of frequent queries from a database of query transactions expressed in the Datalog formalism. A Datalog database can represent complex structures, such as hypergraphs, and allows the use of background knowledge. It is therefore useful in complex domains such as chemistry and bioinformatics. A conventional FQD algorithm, such as WARMR, checks the redundancy of queries with an equivalence relation based on θ-subsumption, which results in the discovery of a large set of frequent queries. In this work, we reduce the set of frequent queries by using another equivalence relation, based on the relevance of a query with respect to a database. Experiments with both real and artificial data sets show that our algorithm is faster than WARMR and that the relevance test can remove up to 92% of the frequent queries.
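
As an illustration of the conventional redundancy check mentioned above, the following is a minimal sketch of θ-subsumption between function-free conjunctive queries, together with a simple support count over transactions. It is not the authors' algorithm and not WARMR's implementation. The atom encoding (tuples of strings), the convention that variables start with an uppercase letter, and the names `subsumes`, `equivalent`, and `support` are all assumptions made for this example. The relevance-based equivalence proposed in the paper is not defined in the abstract and is therefore not shown.

```python
from typing import Dict, Optional, Sequence, Tuple

Atom = Tuple[str, ...]   # (predicate, arg1, arg2, ...): hypothetical encoding
Subst = Dict[str, str]   # variable name -> term it is bound to


def is_var(term: str) -> bool:
    # Assumption: variables start with an uppercase letter, constants do not.
    return term[:1].isupper()


def match_atom(pattern: Atom, target: Atom, theta: Subst) -> Optional[Subst]:
    """Extend theta so that pattern under theta equals target, else None."""
    if pattern[0] != target[0] or len(pattern) != len(target):
        return None
    theta = dict(theta)  # copy, so a failed branch leaves the caller's theta intact
    for p, t in zip(pattern[1:], target[1:]):
        if is_var(p):
            if theta.setdefault(p, t) != t:
                return None  # p is already bound to a different term
        elif p != t:
            return None      # constants must match exactly
    return theta


def subsumes(general: Sequence[Atom], specific: Sequence[Atom],
             theta: Optional[Subst] = None) -> bool:
    """True if one substitution maps every atom of `general` into `specific`.

    Terms of `specific` are never substituted, so its variables behave
    like constants, as in the usual definition of theta-subsumption.
    """
    theta = {} if theta is None else theta
    if not general:
        return True
    first, rest = general[0], general[1:]
    for atom in specific:  # naive backtracking search
        extended = match_atom(first, atom, theta)
        if extended is not None and subsumes(rest, specific, extended):
            return True
    return False


def equivalent(q1: Sequence[Atom], q2: Sequence[Atom]) -> bool:
    """Two queries are equivalent if each theta-subsumes the other."""
    return subsumes(q1, q2) and subsumes(q2, q1)


def support(query: Sequence[Atom], transactions: Sequence[Sequence[Atom]]) -> int:
    """Number of transactions (lists of ground atoms) covered by the query."""
    return sum(1 for t in transactions if subsumes(query, t))


# Two syntactically different queries that are redundant under theta-subsumption.
q_short = [("bond", "X", "Y")]
q_long = [("bond", "X", "Y"), ("bond", "Z", "W")]
# A strictly more specific query: a chain of two bonds.
q_chain = [("bond", "X", "Y"), ("bond", "Y", "Z")]

transactions = [
    [("atom", "a1", "c"), ("atom", "a2", "o"), ("bond", "a1", "a2")],
    [("atom", "b1", "c")],
]
q_carbon_bond = [("atom", "X", "c"), ("bond", "X", "Y")]
```

Here `equivalent(q_short, q_long)` holds, so a miner that filters by this relation would keep only one of the two queries, while `q_chain` is kept as a distinct candidate. The backtracking search is exponential in the worst case, which is one reason the cost of the equivalence and coverage tests matters in FQD.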

References

  1. Agrawal, R., et al.: Fast discovery of association rules. Advances in Knowledge Discovery and Data Mining, ch. 12, pp. 307–328. AAAI/MIT Press, Menlo Park, Calif (1996)

  2. Blockeel, H., et al.: Executing query packs in ILP. In: Cussens, J., Frisch, A.M. (eds.) ILP 2000. LNCS (LNAI), vol. 1866, pp. 60–77. Springer, Heidelberg (2000)

  3. Dechter, R.: Constraint Networks. In: Encyclopedia of Artificial Intelligence, vol. 1, John Wiley & Sons, New York (1992)

  4. Dehaspe, L.: Frequent pattern discovery in first-order logic. PhD thesis, K. U. Leuven, Dept. of Computer Science (1998)

  5. Dehaspe, L., De Raedt, L.: Mining association rules in multiple relations. In: Džeroski, S., Lavrač, N. (eds.) ILP 1997. LNCS, vol. 1297, pp. 125–132. Springer, Heidelberg (1997)

  6. Inokuchi, A., Washio, T., Motoda, H.: An apriori-based algorithm for mining frequent substructures from graph data. In: Zighed, D.A., Komorowski, J., Żytkow, J.M. (eds.) PKDD 2000. LNCS (LNAI), vol. 1910, pp. 13–23. Springer, Heidelberg (2000)

  7. Kuramochi, M., Karypis, G.: Frequent subgraph discovery. In: Proc. International Conference on Data Mining, pp. 313–320. IEEE Computer Society, Los Alamitos (2001)

  8. Maloberti, J., Sebag, M.: Theta-subsumption in a constraint satisfaction perspective. In: Rouveirol, C., Sebag, M. (eds.) ILP 2001. LNCS (LNAI), vol. 2157, pp. 164–178. Springer, Heidelberg (2001)

  9. Nijssen, S., Kok, J.N.: Faster association rules for multiple relations. In: Proc. of the Seventeenth International Joint Conference on Artificial Intelligence, vol. 2, pp. 891–896. Morgan Kaufmann, San Francisco (2001)

  10. Plotkin, G.D.: A note on inductive generalization. In: Plotkin, G.D. (ed.) Machine Intelligence, vol. 5, pp. 153–163. Edinburgh University Press, Edinburgh (1970)

  11. Srinivasan, A., et al.: The predictive toxicology evaluation challenge. In: Proc. Fifteenth International Joint Conference on Artificial Intelligence (IJCAI 1997), pp. 1–6. Morgan Kaufmann, San Francisco (1997)

  12. Ullman, J.D.: Principles of Database and Knowledge-Base Systems, vol. I. Computer Science Press, Rockville (1988)

  13. Yan, X., Han, J.: gSpan: Graph-based substructure pattern mining. Technical Report UIUCDCS-R-2002-2296, Department of Computer Science, University of Illinois at Urbana-Champaign (2002)

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Maloberti, J., Suzuki, E. (2003). Improving Efficiency of Frequent Query Discovery by Eliminating Non-relevant Candidates. In: Grieser, G., Tanaka, Y., Yamamoto, A. (eds) Discovery Science. DS 2003. Lecture Notes in Computer Science, vol 2843. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39644-4_19

  • DOI: https://doi.org/10.1007/978-3-540-39644-4_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20293-6

  • Online ISBN: 978-3-540-39644-4

  • eBook Packages: Springer Book Archive
