Empirical Software Engineering

, Volume 21, Issue 2, pp 337–367

Improving bug management using correlations in crash reports

Article

DOI: 10.1007/s10664-014-9333-9

Cite this article as:
Wang, S., Khomh, F. & Zou, Y. Empir Software Eng (2016) 21: 337. doi:10.1007/s10664-014-9333-9
  • 242 Downloads

Abstract

Nowadays, many software organizations rely on automatic problem reporting tools to collect crash reports directly from users’ environments. These crash reports are later grouped together into crash types. Usually, developers prioritize crash types based on the number of crash reports and file bug reports for the top crash types. Because a bug can trigger a crash in different usage scenarios, different crash types are sometimes related to the same bug. Two bugs are correlated when the occurrence of one bug causes the other bug to occur. We refer to a group of crash types related to identical or correlated bug reports, as a crash correlation group. In this paper, we propose five rules to identify correlated crash types automatically. We propose an algorithm to locate and rank buggy files using crash correlation groups. We also propose a method to identify duplicate and related bug reports. Through an empirical study on Firefox and Eclipse, we show that the first three rules can identify crash correlation groups using stack trace information, with a precision of 91 % and a recall of 87 % for Firefox and a precision of 76 % and a recall of 61 % for Eclipse. On the top three buggy file candidates, the proposed bug localization algorithm achieves a recall of 62 % and a precision of 42 % for Firefox, and a recall of 52 % and a precision of 50 % for Eclipse. On the top 10 buggy file candidates, the recall increases to 92 % for Firefox and 90 % for Eclipse. The proposed duplicate bug report identification method achieves a recall of 50 % and a precision of 55 % on Firefox, and a recall of 47 % and a precision of 35 % on Eclipse. Developers can combine the proposed crash correlation rules with the new bug localization algorithm to identify and fix correlated crash types all together. Triagers can use the duplicate bug report identification method to reduce their workload by filtering duplicate bug reports automatically.

Keywords

Crashes Crash reports Stack traces Bug localization Bug duplication 

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. 1.School of ComputingQueen’s UniversityKingstonCanada
  2. 2.SWAT Lab, DGIGL, Polytechnique MontréalMontréalCanada
  3. 3.Electrical and Computer EngineeringQueen’s UniversityKingstonCanada

Personalised recommendations