Abstract
Given the high rate at which new information is generated, tracing how that information evolves is important. This paper studies techniques for analyzing the evolution of knowledge, starting with their ability to identify the patterns that represent the common information in datasets. From these “patterns,” the evolution of their characteristics over time is analyzed. The paper considers the following techniques for tracing the patterns: LDA (Latent Dirichlet Allocation), Birch (Balanced Iterative Reducing and Clustering using Hierarchies), LAMDA (Learning Algorithm for Multivariate Data Analysis), and K-means. They are used both for the initial task of grouping the data and for analyzing the characteristics of the patterns and the relevance of those characteristics as the patterns evolve (traceability). The paper uses different types of data sources of educational content and, with these datasets, studies the topological models that describe the “patterns” generated by grouping the analyzed data, together with their dynamics (evolution over time). For the evaluation, the paper considers three metrics: the Calinski–Harabasz Index, the Davies–Bouldin Index, and the Silhouette Score.
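As an illustration of the three evaluation metrics named in the abstract, the following is a minimal sketch in plain Python that follows their standard textbook definitions with Euclidean distance. It is not the chapter's implementation: the function names, the toy points, and the labels are assumptions made for this example, and the labels could come from any of the grouping techniques listed above.

```python
from math import dist
from statistics import fmean


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(fmean(axis) for axis in zip(*points))


def group_by_label(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def silhouette_score(points, labels):
    """Mean of (b - a) / max(a, b) over all points; higher is better."""
    clusters = group_by_label(points, labels)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's distance to itself contributes 0 to the sum)
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(fmean(dist(point, q) for q in members)
                for other, members in clusters.items() if other != label)
        scores.append((b - a) / max(a, b))
    return fmean(scores)


def davies_bouldin_index(points, labels):
    """Mean, over clusters, of the worst scatter-to-separation ratio; lower is better."""
    clusters = group_by_label(points, labels)
    centers = {lab: centroid(m) for lab, m in clusters.items()}
    scatter = {lab: fmean(dist(p, centers[lab]) for p in m)
               for lab, m in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / dist(centers[i], centers[j])
                 for j in clusters if j != i)
             for i in clusters]
    return fmean(worst)


def calinski_harabasz_index(points, labels):
    """Between-cluster over within-cluster dispersion, each scaled by its degrees of freedom; higher is better."""
    clusters = group_by_label(points, labels)
    n, k = len(points), len(clusters)
    overall = centroid(points)
    between = sum(len(m) * dist(centroid(m), overall) ** 2
                  for m in clusters.values())
    within = sum(dist(p, centroid(m)) ** 2
                 for m in clusters.values() for p in m)
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical toy data: two well-separated groups of two points each.
toy_points = [(0, 0), (0, 2), (10, 0), (10, 2)]
toy_labels = [0, 0, 1, 1]

print("Silhouette:", round(silhouette_score(toy_points, toy_labels), 3))
print("Davies-Bouldin:", round(davies_bouldin_index(toy_points, toy_labels), 3))
print("Calinski-Harabasz:", round(calinski_harabasz_index(toy_points, toy_labels), 3))
```

The sketch assumes at least two clusters, distinct cluster centroids, and fewer clusters than points. In practice a library implementation would normally be preferred, and recomputing the same metrics on successive snapshots of the data is one possible way to follow how the quality of the patterns changes over time.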
Acknowledgements
This work has been supported by the project 64366: “Contenidos de aprendizaje inteligentes a través del uso de herramientas de Big Data, Analítica Avanzada e IA”, Ministry of Science, Government of Antioquia, Republic of Colombia.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Aguilar, J., Salazar, C., Monsalve-Pulido, J., Montoya, E., Velasco, H. (2021). Traceability Analysis of Patterns Using Clustering Techniques. In: Arabnia, H.R., Ferens, K., de la Fuente, D., Kozerenko, E.B., Olivas Varela, J.A., Tinetti, F.G. (eds) Advances in Artificial Intelligence and Applied Cognitive Computing. Transactions on Computational Science and Computational Intelligence. Springer, Cham. https://doi.org/10.1007/978-3-030-70296-0_19
DOI: https://doi.org/10.1007/978-3-030-70296-0_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-70295-3
Online ISBN: 978-3-030-70296-0
eBook Packages: Computer Science (R0)