Facilitating Crowd Sourced Software Engineering via Stack Overflow


The open source community, along with numerous technical blogs and community websites, publishes vast quantities of free source code online, ranging from snippets to full-blown products. This code embodies the software development community’s domain knowledge and mirrors the structure of the Internet: it is distributed rather than hierarchical, and it is chaotic, incomplete, and inconsistent. StackOverflow.com is a Question and Answer (Q&A) website that uses social media to facilitate knowledge exchange between programmers by mitigating the pitfalls involved in using code from the Internet. Its design nurtures a community of developers and enables crowd sourced software engineering activities, ranging from documentation to providing useful, high-quality code snippets for use in production. In this chapter we review Stack Overflow from three perspectives: (1) its design and its social media characteristics, (2) the role it plays in the software documentation landscape, and (3) its use in the context of the example-centric programming paradigm.



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Ohad Barzilay (1)
  • Christoph Treude (2)
  • Alexey Zagalsky (1)
  1. Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
  2. Department of Computer Science, University of Victoria, Victoria, Canada
