Knowledge and Information Systems, Volume 45, Issue 2, pp 473–490

Integration of multiple network views in Wikipedia

  • Guangyu Wu
  • Pádraig Cunningham
Regular Paper

Abstract

One of the challenges in network data analysis is the determination of the most informative perspective on the network to use in analysis. This is particularly an issue when the network is dynamic and is defined by events that occur over time. We present an example of such a scenario in the analysis of edit networks in Wikipedia—the networks of editors interacting on Wikipedia pages. We propose the prediction of article quality as a task that allows us to quantify the informativeness of alternative network views. We present three fundamentally different views on the data that attempt to capture structural and temporal aspects of the edit networks. We demonstrate that each view captures information that is unique to that view and propose a strategy for integrating the different sources of information.

Keywords

Social network analysis, Data quality, Classification, Wikipedia

Notes

Acknowledgments

This work is supported by Science Foundation Ireland Grant No. 08/SRC/I140 (Clique: Graph and Network Analysis Cluster).

Copyright information

© Springer-Verlag London 2014

Authors and Affiliations

  1. School of Computer Science and Informatics, University College Dublin, Dublin, Ireland
