Initializing Agent-Based Models with Clustering Archetypes
Agent-based models are a powerful tool for predicting population-level behaviors; however, their performance can be sensitive to the initial simulation conditions. This paper introduces a procedure for leveraging large datasets to initialize agent-based simulations in which the population is abstracted into a set of archetypes. We show that these archetypes can be discovered using clustering and evaluate the benefits of selecting clusters based on their stability over time. Our experiments on the GitHub dataset demonstrate that simulation runs performed with the clustering archetypes are more successful at predicting large-scale activity patterns.
Keywords: Agent-based models, GitHub archetypes, Unsupervised learning, Stable clustering
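As an illustration of the idea summarized in the abstract, the sketch below clusters per-user activity vectors into archetypes and then draws an initial agent population from them. It is a minimal sketch under stated assumptions, not the paper's implementation: the use of plain k-means, the two toy features (pushes per day, issue comments per day), the helper names `kmeans` and `init_agents`, and sampling agents in proportion to cluster size are all hypothetical choices, and the stability-over-time selection step is not shown.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initialize centroids from k randomly chosen data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return centroids, labels


def init_agents(centroids, labels, n_agents, seed=0):
    """Create agents whose parameters are copied from an archetype (centroid).

    Assumption: archetypes are sampled in proportion to observed cluster sizes.
    """
    rng = random.Random(seed)
    k = len(centroids)
    weights = [labels.count(c) for c in range(k)]
    picks = rng.choices(range(k), weights=weights, k=n_agents)
    return [{"archetype": c, "rates": list(centroids[c])} for c in picks]


# Hypothetical toy data: [pushes per day, issue comments per day] for six users.
users = [
    [9.0, 0.5], [10.0, 1.0], [11.0, 0.0],  # mostly pushing code
    [0.5, 4.0], [1.0, 5.0], [0.0, 6.0],    # mostly commenting on issues
]

centroids, labels = kmeans(users, k=2)
agents = init_agents(centroids, labels, n_agents=100)
print(centroids)
print(agents[0])
```

Each simulated agent starts from its archetype's average activity rates rather than from an individual user's record, which is what lets a small set of archetypes stand in for a large population.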