On Expanding Abbreviated Identifiers in the Source Code

Yang, Hui; Sun, Xiaobing; Duan, Yucong; Zhao, Han; Li, Bin

doi:10.1007/978-3-319-46257-8_63

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9937)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

Abstract

Program comprehension is an important but difficult task in software development and evolution, and it is costly and time-consuming. Abbreviated identifiers in the source code make comprehension harder still, especially for junior developers who have little experience with the software system. Moreover, a number of studies apply information retrieval (IR) techniques to source code identifiers in support of various software maintenance tasks, and these techniques have difficulty handling abbreviations in the program. Hence, this paper proposes a novel approach to expanding the abbreviations in software identifiers. The approach searches for expansions of abbreviated identifiers using two resources: the program itself and the Web. An empirical study demonstrates that the approach can effectively recommend expansions, which not only helps developers comprehend the program but also helps IR techniques further exploit the natural language information in the program.

Notes

1. Some split words are not abbreviated identifiers. For example, in SymLink the split word link is a dictionary word rather than an abbreviation, so there is no need to search for an expansion of link.
2. http://www.locoy.com/
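
The filtering step described in Note 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the splitting rules (underscores and camelCase boundaries), the tiny DICTIONARY set, and the function names split_identifier and abbreviation_candidates are all assumptions made for illustration. The sketch splits an identifier into words and keeps only those that are not dictionary words, since only those need an expansion search.

```python
import re

# Hypothetical stand-in dictionary; a real run would load a full English word list.
DICTIONARY = {"link", "file", "name", "count", "server"}


def split_identifier(identifier):
    """Split an identifier on underscores and camelCase boundaries (assumed rules)."""
    parts = []
    for chunk in re.split(r"[_\W]+", identifier):
        # Acronym runs (HTTP), capitalized or lowercase words (Link, sym), digit runs.
        parts.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", chunk))
    return [part.lower() for part in parts]


def abbreviation_candidates(identifier, dictionary=DICTIONARY):
    """Return the split words that are not dictionary words and so may need expanding."""
    return [
        word
        for word in split_identifier(identifier)
        if word not in dictionary and not word.isdigit()
    ]


if __name__ == "__main__":
    # "link" is a dictionary word, so only "sym" is left as a candidate.
    print(abbreviation_candidates("SymLink"))
```

Under these assumptions, SymLink yields the single candidate sym, which matches the example in Note 1; how an expansion is then searched for in the program and on the Web is outside the scope of this sketch.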

Acknowledgments

This work is supported in part by the Natural Science Foundation of China under Grant Nos. 61402396 and 61472344, in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under Grant No. 13KJB520027, and in part by the Jiangsu Qin Lan Project.

Author information

Authors and Affiliations

School of Information Engineering, Yangzhou University, Yangzhou, China
Hui Yang, Xiaobing Sun, Han Zhao & Bin Li
Hainan University, Haikou, China
Yucong Duan

Corresponding author

Correspondence to Xiaobing Sun.

Editor information

Editors and Affiliations

University of Manchester, Manchester, United Kingdom
Hujun Yin
Nanjing University, Nanjing, China
Yang Gao
Yangzhou University, Yangzhou, Jiangsu, China
Bin Li
Nanjing University of Aeronautics and Astronautics, Nanjing, China
Daoqiang Zhang
Nanjing Normal University, Nanjing, China
Ming Yang
Yangzhou University, Yangzhou, Jiangsu, China
Yun Li
Ostfalia University of Applied Sciences, Wolfenbüttel, Germany
Frank Klawonn
University of Seville, Seville, Spain
Antonio J. Tallón-Ballesteros

About this paper

Cite this paper

Yang, H., Sun, X., Duan, Y., Zhao, H., Li, B. (2016). On Expanding Abbreviated Identifiers in the Source Code. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2016. IDEAL 2016. Lecture Notes in Computer Science, vol 9937. Springer, Cham. https://doi.org/10.1007/978-3-319-46257-8_63

DOI: https://doi.org/10.1007/978-3-319-46257-8_63
Published: 13 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46256-1
Online ISBN: 978-3-319-46257-8
eBook Packages: Computer Science, Computer Science (R0)
