A Quantitative Comparison of the Subgraph Miners MoFa, gSpan, FFSM, and Gaston

Wörlein, Marc; Meinl, Thorsten; Fischer, Ingrid; Philippsen, Michael

doi:10.1007/11564126_39

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3721)

Abstract

Several new miners for frequent subgraphs have been published recently. Whereas the new approaches are presented in detail, their quantitative evaluations are often of limited value: only the performance on a small set of graph databases is discussed, and the new algorithm is often compared to only a single competitor, based on an executable. It remains unclear how the algorithms perform on larger or different graph databases and which of their distinctive features is best suited to which database. We have re-implemented the subgraph miners MoFa, gSpan, FFSM, and Gaston within a common code base and with the same level of programming expertise and optimization effort. This paper presents the results of a comparative benchmark that ran the algorithms on a comprehensive set of graph databases.

Author information

Authors and Affiliations

Computer Science Department 2, University of Erlangen-Nuremberg, Martensstr. 3, 91058, Erlangen, Germany
Marc Wörlein, Thorsten Meinl, Ingrid Fischer & Michael Philippsen

Editor information

Editors and Affiliations

LIACC/FEP, Universidade do Porto, Portugal
Alípio Mário Jorge
LIAAD-INESC Porto LA / FEP, University of Porto, R. de Ceuta, 118, 6, 4050-190, Porto, Portugal
Luís Torgo
LIAAD-INESC Porto L.A./Faculty of Economics, University of Porto, Rua de Ceuta, 118-6, 4050-190, Porto, Portugal
Pavel Brazdil
Faculdade de Engenharia & LIAAD, Universidade do Porto, Portugal
Rui Camacho
Faculty of Economics of the University of Porto, Portugal
João Gama

Cite this paper

Wörlein, M., Meinl, T., Fischer, I., Philippsen, M. (2005). A Quantitative Comparison of the Subgraph Miners MoFa, gSpan, FFSM, and Gaston. In: Jorge, A.M., Torgo, L., Brazdil, P., Camacho, R., Gama, J. (eds) Knowledge Discovery in Databases: PKDD 2005. Lecture Notes in Computer Science (LNAI), vol 3721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564126_39

DOI: https://doi.org/10.1007/11564126_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29244-9
Online ISBN: 978-3-540-31665-7
eBook Packages: Computer Science, Computer Science (R0)