Mining Edge-Weighted Call Graphs to Localise Software Bugs

  • Frank Eichinger
  • Klemens Böhm
  • Matthias Huber
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5211)


An important problem in software engineering is the automated discovery of noncrashing occasional bugs. In this work we address this problem and show that mining weighted call graphs of program executions is a promising technique. We mine weighted graphs with a combination of structural and numerical techniques. More specifically, we propose a novel reduction technique for call graphs which introduces edge weights. We then present an analysis technique for such weighted call graphs based on graph mining and on traditional feature selection schemes. The technique generalises previous graph mining approaches as it allows for an analysis of weights. Our evaluation shows that our approach finds bugs which previous approaches cannot detect. It also doubles the precision of finding bugs which existing techniques can already localise in principle.
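
As a rough illustration of the reduction idea in the abstract, the sketch below collapses repeated calls between the same pair of methods into one edge whose weight is the call frequency. It then lines up the weights of several executions per edge, which is the kind of numerical table a feature selection scheme could work on. The trace format, the function names `reduce_call_trace` and `edge_weight_table`, and the example method names are assumptions made for illustration; the abstract does not specify the actual reduction or analysis.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]  # (caller, callee); hypothetical trace format


def reduce_call_trace(calls: Iterable[Edge]) -> Dict[Edge, int]:
    """Collapse a trace of call events into edges weighted by call frequency."""
    return dict(Counter(calls))


def edge_weight_table(graphs: List[Dict[Edge, int]]) -> Dict[Edge, List[int]]:
    """List each edge's weight in every execution (0 where the edge is absent)."""
    all_edges = sorted({edge for graph in graphs for edge in graph})
    return {edge: [graph.get(edge, 0) for graph in graphs] for edge in all_edges}


# Two made-up executions: the second calls `helper` far more often.
run_a = [("main", "compute"), ("compute", "helper"), ("compute", "helper")]
run_b = [("main", "compute")] + [("compute", "helper")] * 5

graph_a = reduce_call_trace(run_a)
graph_b = reduce_call_trace(run_b)
table = edge_weight_table([graph_a, graph_b])

for edge, weights in table.items():
    print(edge, weights)
```

In this toy input both executions have the same graph structure and differ only in the weight of the `compute` to `helper` edge. A purely structural comparison would not see that difference, which is why the edge weights are kept.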


Keywords: Edge Weight, Program Execution, Call Frequency, Call Graph, Graph Mining
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Frank Eichinger (1)
  • Klemens Böhm (1)
  • Matthias Huber (1)
  1. Institute for Program Structures and Data Organisation (IPD), Universität Karlsruhe (TH), Germany
