Mining for Paths in Flow Graphs

  • Adam Jocksch
  • José Nelson Amaral
  • Marcel Mitran
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6171)


This paper presents FlowGSP, a data-mining algorithm that discovers frequent sequences of attributes in subpaths of a flow graph. FlowGSP was evaluated using flow graphs derived from the execution of transactions in the IBM® WebSphere® Application Server, a large real-world enterprise application server. The vertices of this flow graph may represent single instructions, bytecodes, basic blocks, regions, or entire methods. These vertices are annotated with attributes that correspond to run-time characteristics of the execution of the program. FlowGSP successfully identified a number of existing characteristics of the WebSphere Application Server which had previously been discovered only through extensive manual examination. In addition, a multi-threaded implementation of FlowGSP demonstrates the algorithm’s suitability for exploiting the resources of modern multi-core computers.
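The core idea in the abstract, finding attribute sequences that recur along subpaths of an annotated flow graph, can be illustrated with a small brute-force sketch. This is not the FlowGSP algorithm itself: it enumerates every bounded-length subpath instead of generating candidates level by level, and it counts each subpath once rather than weighting by execution frequency. The graph, the attribute names, and the functions `subpaths` and `frequent_sequences` are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import product


def subpaths(succ, max_len):
    """Yield every path of 1..max_len vertices that follows edges in succ."""
    def extend(path):
        yield path
        if len(path) < max_len:
            for nxt in succ.get(path[-1], ()):
                yield from extend(path + (nxt,))

    for v in succ:
        yield from extend((v,))


def frequent_sequences(succ, attrs, max_len, min_support):
    """Count attribute sequences (one attribute per vertex) over all subpaths.

    Support here is simply the number of subpaths that exhibit the sequence;
    this is a simplifying assumption, not the paper's definition.
    """
    counts = Counter()
    for path in subpaths(succ, max_len):
        # Pick one attribute from each vertex along the path.
        for seq in product(*(sorted(attrs.get(v, ())) for v in path)):
            counts[seq] += 1
    return {seq: n for seq, n in counts.items() if n >= min_support}


# Hypothetical flow graph: vertices could stand for basic blocks or methods.
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
# Hypothetical run-time attributes attached to each vertex.
attrs = {
    "A": {"icache_miss"},
    "B": {"tlb_miss"},
    "C": {"tlb_miss", "branch_mispredict"},
    "D": {"icache_miss"},
}

result = frequent_sequences(succ, attrs, max_len=2, min_support=2)
for seq, n in sorted(result.items()):
    print(seq, n)
```

Exhaustive enumeration grows quickly with path length and vertex degree, which is why a sketch like this is only practical on toy graphs.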


Data mining; Flow Graphs; Compiler Implementation; Hardware Performance Monitors





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Adam Jocksch (1)
  • José Nelson Amaral (2)
  • Marcel Mitran (3)
  1. Research in Motion, Waterloo, Canada
  2. Department of Computing Science, University of Alberta, Edmonton, Canada
  3. IBM Toronto Software Laboratory, Toronto, Canada
