Abstract
This paper studies the complex network structure of software design networks. In a software design network, each node is a class (a specific part of a piece of software) and each link represents a code-related dependency between two classes. This work provides two main contributions. First, we reveal how typical software networks exhibit a structure very similar to that of other real-world networks: they are sparse, scale-free and have low average node-to-node distances. In addition, we demonstrate how various distance, network clustering and assortativity metrics can provide software engineers with important insights into software design decisions, coupling and inter-package relationships. Second, we propose a novel network-driven method to automatically determine the role of a software class, a problem frequently encountered by software engineers trying to understand a large-scale software system. We use a role taxonomy from the literature that defines six so-called archetypes of software classes; once assigned to a class, an archetype can provide useful insights for engineers. In this paper we train and validate a model that automatically assesses which of these archetypes a class belongs to. Experiments on three unique, high-quality network datasets of large real-world software systems demonstrate that network features yield high-accuracy models for this task.
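To make the network representation concrete, the following minimal sketch builds a toy software design network and computes three of the quantities mentioned above: density (sparsity), in-degree, and average node-to-node distance. The class names and dependencies are hypothetical and chosen only for illustration; they are not taken from the datasets used in this paper, and the sketch is not the authors' implementation.

```python
from collections import deque

# Hypothetical class-level dependencies (illustrative only):
# each key is a class, each value is the set of classes it depends on.
DEPENDENCIES = {
    "MailController": {"MessageStore", "Account", "Notifier"},
    "MessageStore": {"Message", "Account"},
    "Notifier": {"Message"},
    "Account": set(),
    "Message": set(),
}


def directed_density(deps):
    """Fraction of possible directed links that are present."""
    n = len(deps)
    m = sum(len(targets) for targets in deps.values())
    return m / (n * (n - 1)) if n > 1 else 0.0


def in_degrees(deps):
    """Number of classes that depend on each class."""
    counts = {node: 0 for node in deps}
    for targets in deps.values():
        for dst in targets:
            counts[dst] = counts.get(dst, 0) + 1
    return counts


def undirected_neighbors(deps):
    """Neighbor sets when link direction is ignored."""
    nbrs = {node: set() for node in deps}
    for src, targets in deps.items():
        for dst in targets:
            nbrs[src].add(dst)
            nbrs.setdefault(dst, set()).add(src)
    return nbrs


def average_distance(deps):
    """Mean shortest-path length over all reachable ordered node pairs,
    computed with a breadth-first search from every node (direction ignored)."""
    nbrs = undirected_neighbors(deps)
    total, pairs = 0, 0
    for source in nbrs:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in nbrs[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    print("density:", directed_density(DEPENDENCIES))
    print("in-degrees:", in_degrees(DEPENDENCIES))
    print("average distance:", average_distance(DEPENDENCIES))
```

Per-class values such as the in-degree are the kind of network features that could be fed to a classifier; which features and which learning algorithm are actually used is described in the full text, not in this sketch.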
Notes
1. Version 5.304, obtained from https://github.com/k9mail/k-9/releases.
2. Version 5.6, obtained from https://sourceforge.net/projects/sweethome3d.
3. Version 3.1.0, obtained from https://github.com/mars-sim/mars-sim.
4. The constructed feature sets can be found at https://github.com/Xavyr-R/thesis-labelled-files/tree/master/inputfeatures.
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Rademaker, X.T., Chaudron, M.R.V., Takes, F.W. (2019). Automatic Identification of Component Roles in Software Design Networks. In: Aiello, L., Cherifi, C., Cherifi, H., Lambiotte, R., Lió, P., Rocha, L. (eds) Complex Networks and Their Applications VII. COMPLEX NETWORKS 2018. Studies in Computational Intelligence, vol 813. Springer, Cham. https://doi.org/10.1007/978-3-030-05414-4_12
DOI: https://doi.org/10.1007/978-3-030-05414-4_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05413-7
Online ISBN: 978-3-030-05414-4
eBook Packages: Intelligent Technologies and Robotics (R0)