Hierarchy-Debug: a scalable statistical technique for fault localization

Parsa, Saeed; Vahidi-Asl, Mojtaba; Asadi-Aghbolaghi, Maryam

doi:10.1007/s11219-013-9199-x

Published: 02 April 2013

Software Quality Journal, Volume 22, pages 427–466 (2014)


Abstract

Faults may be revealed as an undesired mutual effect of program predicates on each other. Based on this observation, this paper presents Hierarchy-Debug, a new approach for localizing latent bugs. To analyze the vertical effect of predicates on each other and on program termination status, the predicates are fitted into a logistic lasso model. To support scalability, a hierarchical clustering algorithm clusters the predicates according to their presence in different executions. Each cluster is treated as a pseudo-predicate, and a distinct lasso model is built for the intermediate levels of the hierarchy. A majority voting technique then scores the predicates according to their lasso coefficients at the different levels of the hierarchy, and the predicates with relatively higher scores are ranked as fault-relevant predicates. To provide the context of failure, faulty sub-paths are identified as sequences of fault-relevant predicates. The grouping effect of Hierarchy-Debug helps programmers detect multiple bugs. Four case studies evaluate the proposed approach on three well-known test suites: Space, Siemens, and Bash. The evaluations show that Hierarchy-Debug produces more precise results than prior fault localization techniques on the subject programs.
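The first step named in the abstract, fitting predicates into a logistic lasso model, can be illustrated with a minimal sketch. This is not the authors' implementation: the solver (proximal gradient descent with soft-thresholding), the toy execution data, the column centering, the function names, and the values of `lam`, `step`, and `iters` are all assumptions chosen for illustration. The hierarchical clustering and majority voting stages are omitted.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrinks v toward zero by t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def center_columns(rows):
    # Subtract each column's mean (an assumed preprocessing choice).
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def fit_logistic_lasso(rows, labels, lam=0.05, step=1.0, iters=2000):
    # Minimizes mean logistic loss + lam * sum(|w_j|); intercept b is unpenalized.
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        resid = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - y
                 for x, y in zip(rows, labels)]
        grad_b = sum(resid) / n
        grad_w = [sum(r * x[j] for r, x in zip(resid, rows)) / n
                  for j in range(d)]
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

# Hypothetical data: one row per execution, one 0/1 column per predicate
# (1 = the predicate was observed true in that run).
executions = [
    [1, 1, 0], [1, 1, 1], [1, 0, 0], [1, 0, 1],   # failing runs
    [0, 1, 0], [0, 1, 1], [0, 0, 0], [0, 0, 1],   # passing runs
]
failed = [1, 1, 1, 1, 0, 0, 0, 0]                 # termination status

weights, intercept = fit_logistic_lasso(center_columns(executions), failed)

# Rank predicates by the magnitude of their lasso coefficients.
ranking = sorted(range(len(weights)), key=lambda j: -abs(weights[j]))
```

In this toy setup, predicate 0 is true exactly in the failing runs, so it receives a positive coefficient, while the L1 penalty keeps the two predicates that are unrelated to failure at zero. That sparsity is what makes the coefficients usable as a ranking signal.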

Acknowledgments

The authors would like to thank G. Rothermel for making the Siemens test suite available, W. Motycka for his invaluable help and support with the execution of SIR programs, and B. Liblit for his insightful comments. Above all, the authors deeply appreciate the insightful questions, comments, and recommendations from the journal’s Editor-in-Chief, editor, and anonymous referees during the preparation of this paper.

Author information

Authors and Affiliations

Institute of Computer Engineering, Iran University of Science and Technology, Narmak, Tehran, Iran
Saeed Parsa, Mojtaba Vahidi-Asl & Maryam Asadi-Aghbolaghi

Corresponding author

Correspondence to Mojtaba Vahidi-Asl.

About this article

Cite this article

Parsa, S., Vahidi-Asl, M. & Asadi-Aghbolaghi, M. Hierarchy-Debug: a scalable statistical technique for fault localization. Software Qual J 22, 427–466 (2014). https://doi.org/10.1007/s11219-013-9199-x

Published: 02 April 2013
Issue Date: September 2014
DOI: https://doi.org/10.1007/s11219-013-9199-x
