Guest editorial: mining software repositories

Pinzger, Martin; Kim, Sunghun

doi:10.1007/s10664-016-9450-8

Editorial
Published: 09 August 2016

Volume 21, pages 2033–2034 (2016)

Empirical Software Engineering

Martin Pinzger¹ &
Sunghun Kim²

The Mining Software Repositories (MSR) field analyzes the rich data available in software repositories to uncover interesting and actionable information about software systems and projects. Thanks to the ready availability of software configuration management, mailing list, and bug tracking repositories from open source projects, the field has gained popularity since 2004 and continues to be one of the fastest growing in software engineering. Researchers in this field empirically explore a range of software engineering questions using software repository data as the primary source of information. Commonly explored areas include software evolution, models of software development processes, characterization of developers and their activities, prediction of future software qualities, use of machine learning techniques on software project data, software bug prediction, analysis of software change patterns, and analysis of code clones. This special issue presents five recent MSR papers, which are briefly discussed below.

The paper “An In-Depth Study of the Promises and Perils of Mining GitHub” by Kalliamvakou, Gousios, Blincoe, Damian, Singer, and German reports the characteristics of repositories and users on GitHub, including how users take advantage of GitHub’s main features and how their activities are tracked on GitHub and in related datasets. The aim is to point out misalignments between the real and the mined data. The results indicate that, while GitHub provides a rich source of data on software development, researchers who mine GitHub should take various potential perils into account.

The paper “Studying Just-In-Time Defect Prediction Using Cross-Project Models” by Kamei, Fukushima, McIntosh, Yamashita, Ubayashi, and Hassan addresses the cold start problem for Just-In-Time (JIT) defect prediction by using cross-project data. Through an empirical study with eleven open source projects, the authors find that the performance of defect prediction models can be improved by combining the data of several projects to form a larger pool of training data and by selecting projects that are similar to the testing project.

Also on the topic of defect prediction, the paper “Towards Building a Universal Defect Prediction Model with Rank Transformed Predictors” by Zhang, Mockus, Keivanloo, and Zou proposes a universal defect prediction model that uses the transformed data of 1,385 open source projects from SourceForge and GoogleCode. This universal model permits users to predict defects within and across projects with an accuracy comparable to that of within-project prediction models.

In the paper “An Empirical Study of the Impact of Modern Code Review Practices on Software Quality”, the authors McIntosh, Kamei, Adams, and Hassan present an empirical study of code review practices and find that code review coverage, participation, and expertise share a significant link with software quality. These findings indicate that poorly reviewed code has a negative impact on software quality in large software systems.

Finally, the paper “Prompter: Turning the IDE into a Self-confident Programming Assistant” by Ponzanelli, Bavota, Di Penta, Oliveto, and Lanza proposes a system that automatically provides Stack Overflow discussions based on the current context in an Integrated Development Environment (IDE). The results of the evaluation with several participants showed that the approach is effective in helping developers to improve the correctness of their tasks, but that there are issues with the volatility of the recommendations.

Acknowledgments

We are grateful for the continuous support and encouragement offered by the editorial board of the journal Empirical Software Engineering and by the Editors-in-Chief, Lionel Briand and Thomas Zimmermann. We also thank the authors for keeping up with the review schedule and the reviewers for their detailed and constructive comments, which helped to shape the papers.

Author information

Authors and Affiliations

Software Engineering Research Group, Alpen-Adria Universität Klagenfurt, Klagenfurt, Austria
Martin Pinzger
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, Hong Kong
Sunghun Kim

Corresponding author

Correspondence to Martin Pinzger.

About this article

Cite this article

Pinzger, M., Kim, S. Guest editorial: mining software repositories. Empir Software Eng 21, 2033–2034 (2016). https://doi.org/10.1007/s10664-016-9450-8

Published: 09 August 2016
Issue Date: October 2016
DOI: https://doi.org/10.1007/s10664-016-9450-8
