Social-collaborative determinants of content quality in online knowledge production systems: comparing Wikipedia and Stack Overflow

Abstract

Online knowledge production sites, such as Wikipedia and Stack Overflow, are dominated by small groups of contributors. How does this affect knowledge quality and production? Does the persistent presence of some key contributors among the most productive members improve the quality of the knowledge, considered in the aggregate? The paper addresses these questions by correlating week-by-week changes in contribution unevenness, elite resilience (stickiness), and content quality. The goal is to detect whether and how changes in social structural variables influence the quality of the knowledge produced by two representative online knowledge production sites: Wikipedia and Stack Overflow. Regression analysis shows that on Stack Overflow both unevenness and elite stickiness have a curvilinear effect on quality: quality is optimized at specific levels of elite stickiness and unevenness. On Wikipedia, by contrast, quality increases linearly with a decline in entropy overall, and with an increase in stickiness in the maturation phase, after a peak in entropy, elite stickiness, and quality of content has been reached.
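The two structural variables named above can be illustrated with a minimal sketch. The abstract does not give the paper's exact operationalization, so everything here is an assumption made for illustration: unevenness is taken as one minus the normalized Shannon entropy of per-user contribution counts in a week, and elite stickiness as the share of the previous week's top-k contributors who remain in the current week's top k. The function names, the tie-breaking rule, and the sample data are hypothetical.

```python
import math


def unevenness(contributions):
    """Assumed measure: 1 - normalized Shannon entropy of contribution counts.

    `contributions` maps user -> number of contributions in one week.
    0.0 means perfectly even participation; values near 1.0 mean a few
    users account for almost everything.
    """
    counts = [c for c in contributions.values() if c > 0]
    if len(counts) < 2:
        # Convention chosen for this sketch: a lone contributor is maximally uneven.
        return 1.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return 1.0 - entropy / math.log(len(counts))


def top_contributors(contributions, k):
    """Return the set of the k most active users (ties broken by user name)."""
    ranked = sorted(contributions.items(), key=lambda kv: (-kv[1], kv[0]))
    return {user for user, _ in ranked[:k]}


def elite_stickiness(prev_week, this_week, k):
    """Assumed measure: fraction of last week's top-k users still in this week's top k."""
    prev_elite = top_contributors(prev_week, k)
    if not prev_elite:
        return 0.0
    return len(prev_elite & top_contributors(this_week, k)) / len(prev_elite)


# Hypothetical weekly contribution counts, not data from the paper.
week1 = {"a": 50, "b": 30, "c": 10, "d": 10}
week2 = {"a": 40, "c": 35, "b": 5, "e": 20}

print(unevenness(week1))
print(elite_stickiness(week1, week2, k=2))  # 0.5: only "a" stays in the top 2
```

Computing both values for each week yields the kind of week-by-week series that could then be related to a quality score, for instance with a regression that includes squared terms to allow for a curvilinear effect.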

Notes

  1. https://en.wikipedia.org/wiki/Wikipedia:Featured_articles.

  2. https://ores.wikimedia.org/

  3. http://stackoverflow.com/help/privileges/.

  4. http://stackexchange.com/sites.

  5. https://dumps.wikimedia.org/enwiki/20160901/.

  6. https://archive.org/download/stackexchange.

Acknowledgements

The work reported in this paper has been partially supported by NSF under Grants IIS-1636891 and ACI-1547358, and by the US Army Research Laboratory and the UK Ministry of Defence under Agreement Number W911NF-16-3-0001. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the US Army Research Laboratory, the US Government, the UK Ministry of Defence or the UK Government. The US and UK Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

Author information

Corresponding author

Correspondence to Amani Abu Jabal.

About this article

Cite this article

Matei, S.A., Abu Jabal, A. & Bertino, E. Social-collaborative determinants of content quality in online knowledge production systems: comparing Wikipedia and Stack Overflow. Soc. Netw. Anal. Min. 8, 36 (2018). https://doi.org/10.1007/s13278-018-0512-3

  • DOI: https://doi.org/10.1007/s13278-018-0512-3

Keywords

  • Wikipedia
  • Stack Overflow
  • Unevenness
  • Elite stickiness
  • Quality of content