
Traceability Analysis of Patterns Using Clustering Techniques

  • Conference paper
Advances in Artificial Intelligence and Applied Cognitive Computing

Abstract

Given the high rate at which new information is generated, it is important to trace how that information evolves. This paper studies techniques for analyzing the evolution of knowledge, starting with their ability to identify the patterns that represent the common information in datasets. From these “patterns,” the evolution of their characteristics over time is analyzed. The paper considers the following techniques for tracing the patterns: LDA (Latent Dirichlet allocation), Birch (Balanced Iterative Reducing and Clustering using Hierarchies), LAMDA (Learning Algorithm for Multivariate Data Analysis), and K-means. They are used both for the initial task of grouping the data and for analyzing the characteristics of the patterns and the relevance of those characteristics as the patterns evolve (traceability). The paper uses several types of educational-content data sources, and with these datasets it studies the topological models that describe the “patterns” generated by grouping the analyzed data, together with their dynamics (evolution over time). For the evaluation, the paper considers three metrics: the Calinski–Harabasz Index, the Davies–Bouldin Index, and the Silhouette Score.
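The three evaluation metrics named in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes Euclidean distance, uses the textbook definitions of each index, and runs on a made-up toy dataset with two hypothetical labelings (`good` and `bad`). All function and variable names are illustrative.

```python
import math


def centroid(points):
    """Mean point of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def group_by_label(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def calinski_harabasz(points, labels):
    """Between-cluster over within-cluster dispersion; higher is better."""
    clusters = group_by_label(points, labels)
    n, k = len(points), len(clusters)
    overall = centroid(points)
    between = sum(len(c) * math.dist(centroid(c), overall) ** 2
                  for c in clusters.values())
    within = sum(math.dist(p, centroid(c)) ** 2
                 for c in clusters.values() for p in c)
    return (between / (k - 1)) / (within / (n - k))


def davies_bouldin(points, labels):
    """Average worst-case similarity between clusters; lower is better."""
    clusters = group_by_label(points, labels)
    cents = {l: centroid(c) for l, c in clusters.items()}
    scatter = {l: sum(math.dist(p, cents[l]) for p in c) / len(c)
               for l, c in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)


def silhouette(points, labels):
    """Mean silhouette coefficient in [-1, 1]; higher is better."""
    clusters = group_by_label(points, labels)
    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        # Mean distance to the other members of the point's own cluster.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # Mean distance to the nearest other cluster.
        b = min(sum(math.dist(point, q) for q in c) / len(c)
                for other, c in clusters.items() if other != label)
        total += (b - a) / max(a, b)
    return total / len(points)


# Toy data (an assumption for illustration): two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = [0, 0, 0, 1, 1, 1]  # labeling that matches the two groups
bad = [0, 1, 0, 1, 0, 1]   # labeling that mixes the two groups

for name, labels in (("good", good), ("bad", bad)):
    print(name,
          round(calinski_harabasz(points, labels), 3),
          round(davies_bouldin(points, labels), 3),
          round(silhouette(points, labels), 3))
```

A labeling that matches the underlying groups should score higher on the Calinski–Harabasz Index and the Silhouette Score and lower on the Davies–Bouldin Index than one that mixes them. Comparing these scores across snapshots of a dataset is one possible way to follow how cluster quality changes over time.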



Acknowledgements

This work has been supported by the project 64366: “Contenidos de aprendizaje inteligentes a través del uso de herramientas de Big Data, Analítica Avanzada e IA”, Ministry of Science, Government of Antioquia, Republic of Colombia.


Corresponding author

Correspondence to Jose Aguilar.


Copyright information

© 2021 Springer Nature Switzerland AG


Cite this paper

Aguilar, J., Salazar, C., Monsalve-Pulido, J., Montoya, E., Velasco, H. (2021). Traceability Analysis of Patterns Using Clustering Techniques. In: Arabnia, H.R., Ferens, K., de la Fuente, D., Kozerenko, E.B., Olivas Varela, J.A., Tinetti, F.G. (eds) Advances in Artificial Intelligence and Applied Cognitive Computing. Transactions on Computational Science and Computational Intelligence. Springer, Cham. https://doi.org/10.1007/978-3-030-70296-0_19


  • DOI: https://doi.org/10.1007/978-3-030-70296-0_19


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-70295-3

  • Online ISBN: 978-3-030-70296-0

  • eBook Packages: Computer Science, Computer Science (R0)
