Abstract
This paper presents an algorithm for Frequent Query Discovery (FQD) that uses a novel equivalence relation to remove redundant queries from the output. An FQD algorithm returns the set of frequent queries from a database of query transactions expressed in the Datalog formalism. A Datalog database can represent complex structures, such as hypergraphs, and allows the use of background knowledge, which makes it useful in complex domains such as chemistry and bioinformatics. A conventional FQD algorithm, such as WARMR, checks the redundancy of queries with an equivalence relation based on θ-subsumption, which results in the discovery of a large set of frequent queries. In this work, we reduce the set of frequent queries by using another equivalence relation, based on the relevance of a query with respect to a database. Experiments with both real and artificial data sets show that our algorithm is faster than WARMR and that the relevance test can remove up to 92% of the frequent queries.
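To make the baseline redundancy check concrete, the following is a minimal sketch of a θ-subsumption test and the equivalence relation it induces, together with a simple support count over transactions. It is not the algorithm of this paper or of WARMR: the atom encoding, the convention that variables start with an uppercase letter, and the names `is_var`, `match_atom`, `subsumes`, `equivalent`, and `support` are all assumptions made for illustration. The relevance-based equivalence is not defined in the abstract, so it is not sketched here.

```python
from typing import Dict, List, Optional, Tuple

# Assumed encoding: an atom is (predicate, arguments); a query is a list of atoms.
Atom = Tuple[str, Tuple[str, ...]]


def is_var(term: str) -> bool:
    # Assumption: terms starting with an uppercase letter are variables.
    return term[:1].isupper()


def match_atom(a: Atom, b: Atom, theta: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend theta so that a under theta equals b; the terms of b are kept fixed."""
    if a[0] != b[0] or len(a[1]) != len(b[1]):
        return None
    new = dict(theta)
    for s, t in zip(a[1], b[1]):
        if is_var(s):
            # Bind s to t, or fail if s is already bound to a different term.
            if new.setdefault(s, t) != t:
                return None
        elif s != t:
            return None
    return new


def subsumes(c: List[Atom], d: List[Atom],
             theta: Optional[Dict[str, str]] = None) -> bool:
    """True if some substitution maps every atom of c onto an atom of d."""
    theta = {} if theta is None else theta
    if not c:
        return True
    first, rest = c[0], c[1:]
    for atom in d:  # backtracking search over candidate atoms of d
        ext = match_atom(first, atom, theta)
        if ext is not None and subsumes(rest, d, ext):
            return True
    return False


def equivalent(c: List[Atom], d: List[Atom]) -> bool:
    """Two queries are equivalent when each theta-subsumes the other."""
    return subsumes(c, d) and subsumes(d, c)


def support(query: List[Atom], transactions: List[List[Atom]]) -> int:
    """Number of transactions (lists of ground atoms) covered by the query."""
    return sum(1 for t in transactions if subsumes(query, t))


q1 = [("bond", ("X", "Y"))]
q2 = [("bond", ("X", "Y")), ("bond", ("X", "Z"))]  # redundant variant of q1
q3 = [("bond", ("X", "Y")), ("bond", ("Y", "Z"))]  # a genuine path of length 2

transactions = [
    [("bond", ("a", "b")), ("bond", ("b", "c"))],
    [("bond", ("a", "b"))],
]
```

In this sketch `q2` is equivalent to `q1` (mapping `Z` to `Y` collapses it), so a θ-subsumption check would discard one of them, whereas `q3` is strictly more specific and is kept as a distinct candidate. The backtracking search is exponential in the worst case, which is consistent with θ-subsumption being NP-complete.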
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Maloberti, J., Suzuki, E. (2003). Improving Efficiency of Frequent Query Discovery by Eliminating Non-relevant Candidates. In: Grieser, G., Tanaka, Y., Yamamoto, A. (eds) Discovery Science. DS 2003. Lecture Notes in Computer Science, vol 2843. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39644-4_19
DOI: https://doi.org/10.1007/978-3-540-39644-4_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20293-6
Online ISBN: 978-3-540-39644-4
eBook Packages: Springer Book Archive