Applicability of Sequence Analysis Methods in Analyzing Peer-Production Systems: A Case Study in Wikidata

  • To Tu Cuong
  • Claudia Müller-Birn
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10047)


Building a shared understanding of a specific area of interest is of increasing importance in today’s information-centric world. A shared understanding of a domain can be realized by collaboratively building a structured knowledge base about it. Our research is driven by the goal of understanding participation patterns over time in collaborative knowledge-building efforts. We therefore focus our study on one representative project: Wikidata. Wikidata is a free, structured knowledge base that provides structured data to Wikipedia and other Wikimedia projects. This paper builds upon previous research, in which we identified six common participation patterns, i.e., roles, in Wikidata. In the research presented here, we study the applicability of sequence analysis methods by analyzing the dynamics of users’ participation patterns. The sequence analysis is judged by its ability to answer three questions: (i) “Are there any preferable role transitions in Wikidata?”; (ii) “What are the dominant dynamic participation patterns?”; (iii) “Are users who join earlier more turbulent contributors?” Our data set includes the participation patterns of about 20,000 users in each month from October 2012 to October 2014. We show that sequence analysis methods are able to infer interesting role transitions in Wikidata, find dominant dynamic participation patterns, and make statistical inferences. Finally, we discuss the significance of these results for understanding the participation process in Wikidata.
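To illustrate what question (i) asks of the data, the following is a minimal sketch of how role-transition rates could be estimated from monthly role sequences. It is not the authors' implementation: the role labels "A", "B", "C", the toy sequences, and the function name `transition_rates` are all hypothetical, and the abstract does not state which tooling was used.

```python
from collections import Counter

def transition_rates(sequences):
    """Estimate P(next role | current role) from consecutive monthly states.

    `sequences` is a list of per-user lists, one role label per month.
    """
    pair_counts = Counter()   # (current, next) -> number of observed transitions
    from_counts = Counter()   # current -> number of transitions leaving it
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            pair_counts[(current, nxt)] += 1
            from_counts[current] += 1

    rates = {}
    for (current, nxt), n in pair_counts.items():
        rates.setdefault(current, {})[nxt] = n / from_counts[current]
    return rates

# Hypothetical toy data: three users observed over four months each.
sequences = [
    ["A", "A", "B", "B"],
    ["A", "B", "B", "C"],
    ["C", "C", "A", "A"],
]
rates = transition_rates(sequences)
for role in sorted(rates):
    print(role, rates[role])
```

Each row of the resulting table sums to 1, so a comparatively large off-diagonal entry would indicate a frequently observed move from one role to another.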


Keywords: Sequence analysis; Peer-production system; User behavior; Wikidata



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Freie Universität Berlin, Berlin, Germany
