Ties Between Mined Structural Patterns in Program and Their Identifier Names

Mashima, Yoshiki; Hirokawa, Sachio; Takeuchi, Kazuhiro

doi:10.1007/978-3-030-14815-7_28

Conference paper
First Online: 07 March 2019

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11471)

Abstract

Identifier names in readable, maintainable source code are descriptive, and programmers choose them based on implicit knowledge gained through experience. In this paper, we propose a structural pattern mining method for source code based on support vector machines (SVMs). We extract 1,000 method names from object-oriented source code collected from online software repositories and create 1,000 datasets, each labeled with positive and negative classes. The structural features that form the input vectors for SVM training are designed to represent partial characteristics of the abstract syntax tree (AST) parsed from the source code. Applying this method with our structural features, we compiled a list of F1 scores for the 1,000 method names, which shows how strongly each name is associated with a pattern. From this list, we confirmed that structural patterns are strongly associated with specific method names. We also evaluated method names qualitatively by mapping the structural feature vector of each program example onto a two-dimensional plane, following the approach of a major previous study. This evaluation confirmed that structural contrasts among programs correspond to the names given to them. Furthermore, we show examples of visualizing structural patterns using structural features chosen by feature selection.
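
As a rough illustration of the kind of pipeline the abstract describes, the sketch below counts parent-child node-type pairs in an AST, labels each method as positive or negative for one target name, and computes an F1 score from predicted labels. It is a minimal sketch under several assumptions, not the authors' implementation: Python's standard `ast` module stands in for the parser of the object-oriented code studied in the paper, the parent-child pair encoding is only one plausible form of "partial characteristics" of an AST, and the names `structural_features`, `labeled_dataset`, `f1_score`, and `SAMPLE` are hypothetical. SVM training itself is left out.

```python
import ast
from collections import Counter

def structural_features(node: ast.AST) -> Counter:
    """Count (parent type, child type) pairs in the subtree rooted at `node`.

    Assumption: parent-child type pairs are a stand-in for the paper's
    structural features, whose exact design is not given in the abstract.
    The method name is a string attribute, not a child node, so it never
    leaks into the features.
    """
    feats = Counter()
    for parent in ast.walk(node):
        for child in ast.iter_child_nodes(parent):
            feats[(type(parent).__name__, type(child).__name__)] += 1
    return feats

def labeled_dataset(source: str, target_name: str):
    """Return [(features, label)] with label +1 for `target_name`, else -1."""
    data = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            label = 1 if node.name == target_name else -1
            data.append((structural_features(node), label))
    return data

def f1_score(y_true, y_pred) -> float:
    """F1 of the positive class (+1), computed from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical toy corpus: two getters and one setter.
SAMPLE = '''
def get_name(self):
    return self.name

def set_name(self, name):
    self.name = name

def get_name(self):
    return self._name
'''

dataset = labeled_dataset(SAMPLE, "get_name")
labels = [label for _, label in dataset]
```

Each `Counter` could then be turned into a fixed-length vector and passed to any SVM implementation, with one such dataset built per method name and `f1_score` applied to held-out predictions.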

Author information

Authors and Affiliations

Graduate School of Engineering, Osaka Electro-Communication University, Osaka, Japan
Yoshiki Mashima
Research Institute for Information Technology, Kyushu University, Fukuoka, Japan
Sachio Hirokawa
Faculty of Information and Communication Engineering, Osaka Electro-Communication University, Osaka, Japan
Kazuhiro Takeuchi

Corresponding author

Correspondence to Yoshiki Mashima.

Editor information

Editors and Affiliations

Osaka University, Toyonaka, Osaka, Japan
Hirosato Seki
Institute for Chemical Research, Kyoto University, Uji, Kyoto, Japan
Canh Hao Nguyen
Japan Advanced Institute of Science and Technology, Nomi, Ishikawa, Japan
Van-Nam Huynh
Osaka University, Toyonaka, Osaka, Japan
Masahiro Inuiguchi

About this paper

Cite this paper

Mashima, Y., Hirokawa, S., Takeuchi, K. (2019). Ties Between Mined Structural Patterns in Program and Their Identifier Names. In: Seki, H., Nguyen, C., Huynh, VN., Inuiguchi, M. (eds) Integrated Uncertainty in Knowledge Modelling and Decision Making. IUKM 2019. Lecture Notes in Computer Science, vol 11471. Springer, Cham. https://doi.org/10.1007/978-3-030-14815-7_28

DOI: https://doi.org/10.1007/978-3-030-14815-7_28
Published: 07 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14814-0
Online ISBN: 978-3-030-14815-7
eBook Packages: Computer Science, Computer Science (R0)
