Automatic Quality Assessment of Source Code Comments: The JavadocMiner

  • Ninus Khamis
  • René Witte
  • Juergen Rilling
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6177)


Source code documentation is an important software engineering artefact that developers and maintainers rely on for software comprehension and maintenance. It provides insights that help software engineers perform their tasks effectively, so ensuring its quality is extremely important. Inline documentation is at the forefront of explaining a programmer’s original intentions for a given implementation. Because this documentation is written in natural language, its quality has traditionally had to be checked manually. In this paper, we present an effective, automated approach for assessing the quality of inline documentation using a set of heuristics that target both the quality of the language and the consistency between source code and its comments. We apply our tool to the different modules of two open source applications (ArgoUML and Eclipse) and correlate the analysis results with the bugs reported for the individual modules in order to determine connections between documentation and code quality.
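To make the idea of a code/comment consistency heuristic concrete, the following is a minimal sketch of one such check: comparing the `@param` tags of a Javadoc comment with the parameter names of the method it documents. It is an illustration only and is not taken from the JavadocMiner; the function name `check_param_consistency`, the regular expression, and the sample comment and signature are all assumptions made for this example. The signature parsing is deliberately naive and would mis-split generic types that contain commas, such as `Map<String, Integer>`.

```python
import re

# Hypothetical helper, not part of the JavadocMiner: compares the @param tags
# in a Javadoc comment with the parameter names in a Java method signature.
PARAM_TAG = re.compile(r"@param\s+(\w+)")


def check_param_consistency(javadoc: str, signature: str) -> dict:
    """Return parameters missing from the comment and tags with no parameter."""
    documented = PARAM_TAG.findall(javadoc)
    # Take the text between the first "(" and the last ")" of the signature.
    arg_text = signature[signature.index("(") + 1 : signature.rindex(")")]
    # The parameter name is the last token of each comma-separated declaration.
    declared = [part.split()[-1] for part in arg_text.split(",") if part.strip()]
    return {
        "undocumented": [name for name in declared if name not in documented],
        "stale": [name for name in documented if name not in declared],
    }


# Made-up sample input: the comment documents "ratio", the code uses "factor".
javadoc = """/**
 * Scales an image.
 * @param img the source image
 * @param ratio the scaling factor
 */"""
signature = "public Image scale(Image img, double factor)"

result = check_param_consistency(javadoc, signature)
print(result)
```

In this made-up sample, the check reports `factor` as undocumented and `ratio` as a stale tag, which is the kind of mismatch a consistency heuristic is meant to surface. A language-quality heuristic (for example, a readability score over the comment text) would be a separate check.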


Source Code, Natural Language Processing, Open Source Project, Return Type, Java Source Code
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Ninus Khamis (1)
  • René Witte (1)
  • Juergen Rilling (1)
  1. Department of Computer Science and Software Engineering, Concordia University, Montréal, Canada
