Mining Frequent Connected Subgraphs Reducing the Number of Candidates

Gago Alonso, Andrés; Medina Pagola, José Eladio; Carrasco-Ochoa, Jesús Ariel; Martínez-Trinidad, José Fco.

doi:10.1007/978-3-540-87479-9_42

Andrés Gago Alonso¹,², José Eladio Medina Pagola¹, Jesús Ariel Carrasco-Ochoa² & José Fco. Martínez-Trinidad²

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5211)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

In this paper, a new algorithm for mining frequent connected subgraphs, called gRed (graph Candidate Reduction Miner), is presented. The algorithm is based on the gSpan algorithm proposed by Yan and Han. The mining process is optimized by introducing new heuristics that reduce the number of candidates. The performance of gRed is compared against two of the most popular and efficient algorithms available in the literature, gSpan and Gaston. Experiments on real-world databases show that gRed outperforms gSpan and achieves better performance than Gaston at low minimum support thresholds when the databases are large.

References

Agrawal, R., Srikant, R.: Fast Algorithms for Mining Association Rules. In: Proceedings of the 1994 International Conference on Very Large Data Bases (VLDB 1994), Santiago, Chile, pp. 487–499 (1994)
Borgelt, C., Berthold, M.R.: Mining Molecular Fragments: Finding Relevant Substructures of Molecules. In: Proceedings of the 2002 International Conference on Data Mining (ICDM 2002), Maebashi, Japan, pp. 211–218 (2002)
Han, J., Cheng, H., Xin, D., Yan, X.: Frequent Pattern Mining: Current Status and Future Directions. In: Data Mining and Knowledge Discovery (DMKD 2007), 10th Anniversary Issue, vol. 15(1), pp. 55–86 (2007)
Huan, J., Wang, W., Prins, J.: Efficient Mining of Frequent Subgraph in the Presence of Isomorphism. In: Proceedings of the 2003 International Conference on Data Mining (ICDM 2003), Melbourne, FL, pp. 549–552 (2003)
Inokuchi, A., Washio, T., Motoda, H.: An Apriori-based Algorithm for Mining Frequent Substructures from Graph Data. In: Zighed, D.A., Komorowski, J., Żytkow, J.M. (eds.) PKDD 2000. LNCS (LNAI), vol. 1910, pp. 13–23. Springer, Heidelberg (2000)
Inokuchi, A., Washio, T., Nishimura, K., Motoda, H.: A Fast Algorithm for Mining Frequent Connected Subgraphs. Technical Report RT0448, IBM Research, Tokyo Research Laboratory, pp. 10 (2002)
Kuramochi, M., Karypis, G.: Frequent Subgraph Discovery. In: Proceedings of the 2001 International Conference on Data Mining (ICDM 2001), San Jose, CA, pp. 313–320 (2001)
Nijssen, S., Kok, J.: A Quickstart in Frequent Structure Mining can Make a Difference. In: Proceedings of the 2004 ACM SIGKDD International Conference on Knowledge Discovery in Databases (KDD 2004), Seattle, WA, pp. 647–652 (2004)
Srinivasan, A., King, R.D., Muggleton, S.H., Sternberg, M.: The Predictive Toxicology Evaluation Challenge. In: Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI 1997), pp. 1–6. Morgan-Kaufmann, San Francisco (1997)
Wörlein, M., Meinl, T., Fischer, I., Philippsen, M.: A Quantitative Comparison of the Subgraph Miners MoFa, gSpan, FFSM, and Gaston. In: Jorge, A.M., Torgo, L., Brazdil, P.B., Camacho, R., Gama, J. (eds.) PKDD 2005. LNCS (LNAI), vol. 3721, pp. 392–403. Springer, Heidelberg (2005)
Yan, X., Han, J.: gSpan: Graph-Based Substructure Pattern Mining. In: Proceedings of the 2002 International Conference on Data Mining (ICDM 2002), Maebashi, Japan, pp. 721–724 (2002)
Yan, X., Han, J.: gSpan: Graph-Based Substructure Pattern Mining, Expanded Version. UIUC Technical Report UIUCDCS-R-2002-2296 (2002)

Author information

Authors and Affiliations

Data Mining Department, Advanced Technologies Application Center (CENATAV), 7a # 21812 e/ 218 y 222, Rpto. Siboney, Playa, CP: 12200, La Habana, Cuba
Andrés Gago Alonso & José Eladio Medina Pagola
National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, Sta. María Tonantzintla, Puebla, CP: 72840, Mexico
Andrés Gago Alonso, Jesús Ariel Carrasco-Ochoa & José Fco. Martínez-Trinidad

Editor information

Walter Daelemans, Bart Goethals, Katharina Morik

About this paper

Cite this paper

Gago Alonso, A., Medina Pagola, J.E., Carrasco-Ochoa, J.A., Martínez-Trinidad, J.F. (2008). Mining Frequent Connected Subgraphs Reducing the Number of Candidates. In: Daelemans, W., Goethals, B., Morik, K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2008. Lecture Notes in Computer Science, vol 5211. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87479-9_42

DOI: https://doi.org/10.1007/978-3-540-87479-9_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87478-2
Online ISBN: 978-3-540-87479-9
eBook Packages: Computer Science, Computer Science (R0)
