Computational and Mathematical Organization Theory, Volume 18, Issue 3, pp 257–279

Maximizing benefits from crowdsourced data

  • Geoffrey Barbier
  • Reza Zafarani
  • Huiji Gao
  • Gabriel Fung
  • Huan Liu
SI: Data to Model


Crowds of people can solve some problems faster than individuals or small groups. A crowd can also rapidly generate data about circumstances affecting the crowd itself. This crowdsourced data can be leveraged to benefit the crowd by providing information or solutions faster than traditional means. However, crowdsourced data can rarely be used directly to yield usable information. Intelligently analyzing and processing it can prepare the data to maximize the usable information, thus returning the benefit to the crowd. This article highlights challenges and investigates opportunities associated with mining crowdsourced data to yield useful information, details how crowdsourced information and technologies can be used for response coordination when needed, and suggests related areas for future research.


Crowdsourcing, Event maps, Community maps, Crisis maps, Social media, Data mining, Machine learning, Humanitarian Aid and Disaster Relief (HADR)



The authors wish to acknowledge the members of the Arizona State University Data Mining and Machine Learning Laboratory for their motivating influence and thought-inspiring comments and questions on this topic. This work, in particular the content of Sect. 5, was inspired by and based on an ongoing project, “ASU Coordination Tracker (ACT) for Disaster Relief”. This work was funded, in part, by the Office of Naval Research (ONR), the Air Force Office of Scientific Research (AFOSR), and the OSD-T&E (Office of Secretary Defense-Test and Evaluation), Defense-Wide/PE0601120D8Z National Defense Education Program (NDEP)/BA-1, Basic Research; SMART Program Office, Grant Number N00244-09-1-0081. This work is approved for public release, case number 88ABW-2012-1644.



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Geoffrey Barbier (1)
  • Reza Zafarani (2)
  • Huiji Gao (2)
  • Gabriel Fung (3)
  • Huan Liu (2)

  1. Air Force Research Laboratory, Dayton, USA
  2. Arizona State University, Tempe, USA
  3. IGNGAB Lab, Hong Kong, China
