Abstract
In this paper, we present the results of a manual assessment on the coherence between the comments and the implementation of 3636 methods in three open source software applications (for one of these applications, we considered two different subsequent versions) implemented in Java. The results of this assessment have been collected in a dataset we made publicly available on the Web. The creation of this dataset is based on a protocol that is detailed in this paper. We present that protocol to let researchers evaluate the goodness of our dataset and to ease its future possible extensions. Another contribution of this paper consists in preliminarily investigating on the effectiveness of adopting a Vector Space Model (VSM) with the tf-idf schema to discriminate coherent and non-coherent methods. We observed that the lexical similarity alone is not sufficient for this distinction, while encouraging results have been obtained by applying an Support Vector Machine (SVM) classifier on the whole vector space.
Similar content being viewed by others
Notes
1 The comment right before/after of the definition of a method, a class, abstract class and so on.
3 In our case, an annotator is a person that produces annotations to software associating coherence information to methods.
10 Such approach is usually referred to as macroaveraging (Manning et al. 2008).
References
Antoniol, G., Canfora, G., Casazza, G., & De Lucia, A. (2000). Information retrieval models for recovering traceability links between code and documentation. In Proceedings of the international conference on software maintenance (pp. 40–51): IEEE Computer Society.
Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13, 281–305.
Binkley, D., Lawrie, D., Pollock, L., Hill, E., & Vijay-Shanker, K. (2013). A dataset for evaluating identifier splitters, IEEE Computer Society.
Bishop, C. M. (2006). Pattern recognition and machine learning (information science and statistics), Springer-Verlag New York, Inc., Secaucus.
Campbell, I., & Yiming, Y. (2011). Learning with support vector machines, Morgan and Claypool.
Caprile, B., & Tonella, P. (2000). Restructuring program identifier names. In Proceedings of international conference on software maintenance (pp. 97–107): IEEE Computer Society.
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46.
Cohen, J. (1968). Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin, 70(4), 213–220.
Corazza, A., Di Martino, S., & Maggio, V. (2012). LINSEN: an efficient approach to split identifiers and expand abbreviations. In Proceedings of international conference on software maintenance (pp. 233–242): IEEE Computer Society.
Corazza, A., Di Martino, S., Maggio, V., & Scanniello, G. (2011). Investigating the use of lexical information for software system clustering. In Proceedings of European conference on software maintenance and reengineering (pp. 35–44): IEEE Computer Society.
Corazza, A., Maggio, V., & Scanniello, G. (2015). On the coherence between comments and implementations in source code. In Proceedings of EUROMICRO conference on software engineering and advanced applications (pp. 76–83): IEEE Computer Society.
de Souza, S. C. B., Anquetil, N., & de Oliveira, K. M. (2005). A study of the documentation essential to software maintenance. In Proceedings of the international conference on design of communication: documenting & designing for pervasive information (pp. 68–75): ACM.
DeLine, R., Khella, A., Czerwinski, M., & Robertson, G. (2005). Towards understanding programs through wear-based filtering. In Proceedings of the 2005 ACM symposium on Software visualization, SoftVis ’05 (pp. 183–192): ACM.
Dit, B., Revelle, M., Gethers, M., & Poshyvanyk, D. (2013). Feature location in source code: a taxonomy and survey. Journal of Software: Evolution and Process, 25 (1), 53–95.
Fluri, B., Wursch, M., & Gall, H. (2007). Do code and comments co-evolve? on the relation between source code and comment changes. In Proceedings of the working conference on reverse engineering (pp. 70–79): IEEE Computer Society.
Fowler, M. (1999). Refactoring: improving the design of existing code. Boston: Addison-Wesley Longman Publishing Co., Inc.
Freund, R. J., & Wilson, W. J. (2003). Statistical methods, 2nd edn. Academic Press.
Jiang, Z. M., & Hassan, A. E. (2006). Examining the evolution of code comments in postgresql. In Diehl, S., Gall, H., & Hassan, A. E. (Eds.) Proceedings of mining software repositories (pp. 179–180. ACM).
Keyes, J. (2002). Software engineering handbook: Taylor & Francis.
Kuhn, A., Ducasse, S., & Gîrba, T. (2007). Semantic clustering identifying topics in source code. Information & Software Technology, 49(3), 230–243.
LaToza, T. D., Venolia, G., & DeLine, R. (2006). Maintaining mental models: a study of developer work habits. In Proceedings of the 28th international conference on software engineering, ICSE ’06 (pp. 492–501): ACM.
Lawrie, D., Binkley, D., & Morrell, C. (2010). Normalizing source code vocabulary. In Proceedings of working conference on reverse engineering (pp. 3–12): IEEE Computer Society.
Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to information retrieval. New York: Cambridge University Press.
McMillan, C., Grechanik, M., Poshyvanyk, D., Fu, C., & Xie, Q. (2012). Exemplar: a source code search engine for finding highly relevant applications. IEEE Transactions on Software Engineering, 38(5), 1069–1087.
Robillard, M. P., Coelho, W., & code, G. C. Murphy. (2004). How effective developers investigate source. An exploratory study. IEEE Transactions on Software Engineering, 30(12), 889–903.
Roehm, T., Tiarks, R., Koschke, R., & Maalej, W. (2012). How do professional developers comprehend software?. In Proceedings of the 2012 international conference on software engineering, ICSE 2012 (pp. 255–265). Piscataway, NJ, USA: IEEE Press.
Salviulo, F., & Scanniello, G. (2014). Dealing with identifiers and comments in source code comprehension and maintenance: Results from an ethnographically-informed study with students and professionals. In Proceedings of International Conference on Evaluation and Assessment in Software Engineering (pp. 423–432): ACM Press.
Scanniello, G., Marcus, A., & Pascale, D. (2015). Link analysis algorithms for static concept location: an empirical assessment. Empirical Software Engineering, 20 (6), 1666–1720.
Singer, J., Lethbridge, T., Vinson, N., & Anquetil, N. (1997). An examination of software engineering work practices. In Proceedings of the conference of the centre for advanced studies on collaborative research (p. 21): IBM Press.
Soloway, E., & Ehrlich, K. (1984). Empirical studies of programming knowledge. IEEE Transactions on Software Engineering, 10(5), 595–609.
Steidl, D., Hummel, B., & Jürgens, E. (2013). Quality analysis of source code comments. In Proceedings of international conference on program comprehension (pp. 83–92): IEEE Computer Society.
Tan, L., Yuan, D., Krishna, G., & Zhou, Y. (2007). iComment: Bugs or bad comments? ACM.
Tan, S. H., Marinov, D., Tan, L., & Leavens, G. T. (2012). @tcomment: Testing javadoc comments to detect comment-code inconsistencies. In Proceedings of international conference on software testing (pp. 260–269): IEEE Computer Society.
Van Der Maaten, L. (2014). Accelerating t-sne using tree-based algorithms. Journal of Machine Learning Research, 15(1), 3221–3245.
Vapnik, V. (1995). The nature of statistical learning theory. New York: Springer.
Wohlin, C., Runeson, P., Höst, M., Ohlsson, M., Regnell, B., & Wesslén, A. (2012). Experimentation in software engineering. Computer science: Springer.
Acknowledgment
We would like to thank the annotators of our dataset and the reviewers for their precious and constructive comments and suggestions.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Corazza, A., Maggio, V. & Scanniello, G. Coherence of comments and method implementations: a dataset and an empirical investigation. Software Qual J 26, 751–777 (2018). https://doi.org/10.1007/s11219-016-9347-1
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11219-016-9347-1