FENSE: A feature-based ensemble modeling approach to cross-project just-in-time defect prediction

Empirical Software Engineering

Abstract

Context:

Just-in-time defect prediction (JITDP) leverages modern machine learning models to predict the defect-proneness of commits. Such models require adequate training data, which is unavailable in projects with short histories. To address this problem, cross-project methods reuse the data or models of other projects to make predictions, on the assumption that the projects share similar defect-related features. However, existing methods overlook these features, which leads to unsatisfactory model performance.

Objective:

This study aims to investigate the relationship between cross-project JITDP performance and project features, and thereby to improve the performance of cross-project models.

Method:

We propose a Feature-based ENSEmble modeling approach (FENSE) to cross-project JITDP. For a target project, FENSE pairs it with each source project and obtains 20 features for each pair. Using these features, it predicts the transferability of each off-the-shelf JITDP model. FENSE then identifies the most transferable models and combines them to make cross-project predictions. To support this, we conduct a large-scale empirical study of 113,906 project pairs on GitHub and investigate the impact of project features.
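The select-then-combine step described above can be illustrated with a minimal sketch. The abstract does not say how transferability is estimated or how the selected models are combined, so everything here is an assumption for illustration: the transferability scores are given as plain numbers, the combination rule is a simple average of predicted defect probabilities, and the names `select_top_k`, `ensemble_predict`, `proj_a`, `proj_b`, and `proj_c` are hypothetical.

```python
from statistics import mean


def select_top_k(transferability, k):
    """Return the names of the k source models with the highest score.

    `transferability` maps a source-project name to a predicted
    transferability score (assumed to be precomputed elsewhere).
    """
    ranked = sorted(transferability, key=transferability.get, reverse=True)
    return ranked[:k]


def ensemble_predict(commit_probs, selected, threshold=0.5):
    """Average the selected models' defect probabilities per commit.

    `commit_probs` maps a source-project name to a list of defect
    probabilities, one per target-project commit. Averaging is an
    assumed combination rule, not necessarily the one used by FENSE.
    """
    n_commits = len(commit_probs[selected[0]])
    averaged = [
        mean(commit_probs[name][i] for name in selected)
        for i in range(n_commits)
    ]
    labels = [p >= threshold for p in averaged]
    return labels, averaged


# Hypothetical predicted transferability of three source-project models.
transferability = {"proj_a": 0.71, "proj_b": 0.42, "proj_c": 0.65}

# Hypothetical defect probabilities each model assigns to two target commits.
commit_probs = {
    "proj_a": [0.9, 0.2],
    "proj_b": [0.1, 0.9],
    "proj_c": [0.7, 0.4],
}

selected = select_top_k(transferability, k=2)
labels, averaged = ensemble_predict(commit_probs, selected)
print(selected, labels)
```

In this toy setting the least transferable model (`proj_b`) is excluded, so its contrary predictions do not affect the combined output. A weighted average using the transferability scores would be an equally plausible alternative.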

Results:

The results show that: (1) cross-project transferability is highly related to features including programming language and the defect ratio of the source project; (2) our feature-based model selection scheme can improve the cross-project JITDP performance by 10%; (3) FENSE outperforms other models on five evaluation measures without extra time and space costs.

Conclusions:

Our study suggests that project features can help identify powerful cross-project JITDP models and improve the performance of ensemble approaches.

Notes

  1. https://www.nltk.org/

  2. http://libraries.io/

  3. https://guides.GitHub.com/features/issues/

  4. https://GitHub.blog/2013-01-22-closing-issues-via-commit-messages/

  5. https://github.com/scrapy/scrapy/

Acknowledgments

This work is supported by the National Key R&D Program of China (2020AAA0103504), the National Natural Science Foundation of China (No. 61872373), and the Major Key Project of PCL.

Author information

Corresponding authors

Correspondence to Yue Yu or Xinjun Mao.

Additional information

Communicated by: Markus Borg

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, T., Yu, Y., Mao, X. et al. FENSE: A feature-based ensemble modeling approach to cross-project just-in-time defect prediction. Empir Software Eng 27, 162 (2022). https://doi.org/10.1007/s10664-022-10185-8

  • DOI: https://doi.org/10.1007/s10664-022-10185-8
