Improving Efficiency of Frequent Query Discovery by Eliminating Non-relevant Candidates

Maloberti, Jérôme; Suzuki, Einoshin

doi:10.1007/978-3-540-39644-4_19

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2843)

Included in the following conference series:

International Conference on Discovery Science

Abstract

This paper presents an algorithm for Frequent Query Discovery (FQD) that employs a novel equivalence relation to remove redundant queries from the output. An FQD algorithm returns the set of frequent queries from a database of query transactions expressed in the Datalog formalism. A Datalog database can represent complex structures, such as hypergraphs, and allows the use of background knowledge, which makes it useful in complex domains such as chemistry and bioinformatics. A conventional FQD algorithm, such as WARMR, checks the redundancy of queries with an equivalence relation based on θ-subsumption, which results in the discovery of a large set of frequent queries. In this work, we reduce the set of frequent queries by using another equivalence relation, based on the relevance of a query with respect to a database. Experiments with both real and artificial data sets show that our algorithm is faster than WARMR and that the relevance test can remove up to 92% of the frequent queries.
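
As a rough illustration of the conventional redundancy test mentioned above, the following sketch checks θ-subsumption between two conjunctive queries by brute force and treats two queries as equivalent when each θ-subsumes the other. It is not the authors' algorithm and does not implement their relevance test. The atom representation (a predicate name with a tuple of terms), the convention that variables start with an uppercase letter, and the names `theta_subsumes` and `equivalent` are assumptions made only for this example.

```python
from itertools import product

# A query is a list of atoms; an atom is (predicate, args) with args a tuple.
# Assumption for this sketch: a term is a variable if it starts with an
# uppercase letter, otherwise it is a constant.
def is_var(term):
    return term[:1].isupper()

def theta_subsumes(c, d):
    """Return True if some substitution theta maps every atom of c into d."""
    d_atoms = set(d)
    variables = sorted({t for _, args in c for t in args if is_var(t)})
    terms = sorted({t for _, args in d for t in args})
    # Brute force: try every assignment of the variables of c to terms of d.
    for values in product(terms, repeat=len(variables)):
        theta = dict(zip(variables, values))
        image = {(p, tuple(theta.get(t, t) for t in args)) for p, args in c}
        if image <= d_atoms:
            return True
    return False

def equivalent(c, d):
    """Equivalence under theta-subsumption: each query subsumes the other."""
    return theta_subsumes(c, d) and theta_subsumes(d, c)

q1 = [("bond", ("X", "Y"))]
q2 = [("bond", ("X", "Y")), ("bond", ("X", "Z"))]  # second atom is redundant
q3 = [("bond", ("X", "Y")), ("atom", ("X", "c"))]  # adds a real constraint

print(equivalent(q1, q2))  # True: mapping Z to Y folds q2 onto q1
print(equivalent(q1, q3))  # False: q1 has no atom/2 literal to match
```

The exhaustive search is exponential in the number of variables, which is acceptable for a toy example only; practical systems use far more efficient subsumption procedures.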

Author information

Authors and Affiliations

Laboratoire de Recherche en Informatique (LRI), Université Paris-Sud, Bât 490, F-91405, Orsay Cedex, France
Jérôme Maloberti
Electrical and Computer Engineering, Yokohama National University, 79-5 Tokiwadai, Hodogaya, Yokohama, 240-8501, Japan
Jérôme Maloberti & Einoshin Suzuki

Editor information

Editors and Affiliations

FG Knowledge Engineering, FB Informatik, Technical University Darmstadt, Hochschulstr. 10, 64289, Darmstadt
Gunter Grieser
Meme Media Laboratory, Hokkaido University, N13 W8, 0608628, Sapporo, Japan
Yuzuru Tanaka
Graduate School of Informatics, Kyoto University Yoshida Honmachi, Sakyo-ku, 606-850, Kyoto, Japan
Akihiro Yamamoto

About this paper

Cite this paper

Maloberti, J., Suzuki, E. (2003). Improving Efficiency of Frequent Query Discovery by Eliminating Non-relevant Candidates. In: Grieser, G., Tanaka, Y., Yamamoto, A. (eds) Discovery Science. DS 2003. Lecture Notes in Computer Science, vol 2843. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39644-4_19

DOI: https://doi.org/10.1007/978-3-540-39644-4_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20293-6
Online ISBN: 978-3-540-39644-4
eBook Packages: Springer Book Archive
