Mining Cohesive Itemsets in Graphs

Hendrickx, Tayena; Cule, Boris; Goethals, Bart

doi:10.1007/978-3-319-11812-3_10

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8777)

Included in the following conference series:

International Conference on Discovery Science

Abstract

Discovering patterns in graphs is a well-studied field of data mining. While a lot of work has already gone into finding structural patterns in graph datasets, we focus on relaxing the structural requirements in order to find items that often occur near each other in the input graph. By doing this, we significantly reduce the search space and simplify the output. We look for itemsets that are both frequent and cohesive, which enables us to use the anti-monotonicity property of the frequency measure to speed up our algorithm. We experimentally demonstrate that our method can handle larger and more complex datasets than existing methods, which either run out of memory or take too long.
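To make the role of anti-monotonicity concrete, here is a minimal Python sketch of level-wise pruning on a vertex-labelled graph. It is not the authors' algorithm: the abstract does not define the frequency or cohesion measures, so the sketch assumes a stand-in frequency (the fraction of vertices whose `radius`-hop neighbourhood contains every item of the itemset) and omits the cohesion filter entirely. The names `neighbourhood_items`, `frequent_itemsets`, `radius` and `min_freq` are hypothetical.

```python
from collections import deque
from itertools import combinations


def neighbourhood_items(graph, labels, start, radius):
    """Collect the items found on all vertices within `radius` hops of `start` (BFS)."""
    seen = {start}
    queue = deque([(start, 0)])
    items = set(labels[start])
    while queue:
        vertex, dist = queue.popleft()
        if dist == radius:
            continue  # do not expand beyond the allowed radius
        for nxt in graph[vertex]:
            if nxt not in seen:
                seen.add(nxt)
                items |= labels[nxt]
                queue.append((nxt, dist + 1))
    return items


def frequent_itemsets(graph, labels, radius, min_freq):
    """Level-wise search using an ASSUMED anti-monotone frequency measure.

    Frequency here is the fraction of vertices whose neighbourhood contains
    the whole itemset. Adding an item can never raise that fraction, so a
    candidate with an infrequent subset can be discarded without counting.
    """
    hoods = [neighbourhood_items(graph, labels, v, radius) for v in graph]

    def freq(itemset):
        return sum(itemset <= hood for hood in hoods) / len(hoods)

    items = sorted(set().union(*labels.values()))
    level = {frozenset([item]) for item in items}
    result = {}
    size = 1
    while level:
        # Count the surviving candidates of the current size.
        kept = {}
        for cand in level:
            f = freq(cand)
            if f >= min_freq:
                kept[cand] = f
        result.update(kept)
        size += 1
        # Join step: merge frequent sets that differ in exactly one item.
        candidates = {a | b for a in kept for b in kept if len(a | b) == size}
        # Prune step: every (size - 1)-subset must itself be frequent.
        level = {
            c for c in candidates
            if all(frozenset(s) in kept for s in combinations(c, size - 1))
        }
    return result


if __name__ == "__main__":
    # Toy path graph 1 - 2 - 3 - 4 with one item per vertex (illustrative data).
    graph = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
    labels = {1: {"a"}, 2: {"b"}, 3: {"a"}, 4: {"c"}}
    for itemset, f in frequent_itemsets(graph, labels, radius=1, min_freq=0.6).items():
        print(sorted(itemset), f)
```

The prune step is where the speed-up comes from: supersets of an infrequent itemset are never generated or counted. A cohesion check, as described in the abstract, would be applied to the surviving itemsets as a separate filter.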

Author information

Authors and Affiliations

University of Antwerp, Belgium
Tayena Hendrickx, Boris Cule & Bart Goethals

Editor information

Editors and Affiliations

Department of Knowledge Technologies, Jožef Stefan Institute, Jamova cesta 39, 1000, Ljubljana, Slovenia
Sašo Džeroski, Panče Panov & Dragi Kocev
Faculty of Administration, University of Ljubljana, Gosarjeva 5, 1000, Ljubljana, Slovenia
Ljupčo Todorovski

Cite this paper

Hendrickx, T., Cule, B., Goethals, B. (2014). Mining Cohesive Itemsets in Graphs. In: Džeroski, S., Panov, P., Kocev, D., Todorovski, L. (eds) Discovery Science. DS 2014. Lecture Notes in Computer Science, vol 8777. Springer, Cham. https://doi.org/10.1007/978-3-319-11812-3_10

DOI: https://doi.org/10.1007/978-3-319-11812-3_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11811-6
Online ISBN: 978-3-319-11812-3
eBook Packages: Computer Science, Computer Science (R0)
