From crowdsourcing to crowdmining: using implicit human intelligence for better understanding of crowdsourced data

Abstract

With the development of mobile social networks, ever more crowdsourced data are generated on the Web or collected through real-world sensing. The fragmented, heterogeneous, and noisy nature of online/offline crowdsourced data, however, makes them difficult to understand. Traditional content-based analysis methods can suffer from issues such as high computational cost and poor performance. To address these issues, this paper presents CrowdMining. In particular, we observe that crowdsourced data mining neglects the knowledge hidden in the data generation process: individual/crowd behavior patterns (e.g., mobility patterns and community contexts such as social ties and structure) and crowd-object interaction patterns (e.g., flickering or tweeting patterns). We therefore propose a novel approach that leverages this implicit human intelligence (implicit HI) for crowdsourced data mining and understanding. Two studies, CrowdEvent and CrowdRoute, showcase its usage, with implicit HI extracted from either online or offline crowdsourced data. A generic model for CrowdMining is further proposed based on a set of existing studies. Experiments on real-world datasets demonstrate the effectiveness of CrowdMining.

Notes

  1. http://www.mturk.com/

  2. http://www.crowdflower.com/

  3. http://www.twitter.com/

  4. http://www.wikipedia.org/

  5. http://answers.yahoo.com/

  6. http://www.yelp.com/

  7. http://www.digg.com

  8. www.youku.com

  9. http://www.flickr.com

  10. https://foursquare.com

Funding

This work was partially supported by the National Key R&D Program of China (No. 2017YFB1001803), the National Basic Research Program of China (No. 2015CB352400), and the National Natural Science Foundation of China (Nos. 61772428, 61725205).

Author information

Corresponding author

Correspondence to Bin Guo.

Additional information

This article belongs to the Topical Collection: Special Issue on Smart Computing and Cyber Technology for Cyberization

Guest Editors: Xiaokang Zhou, Flavia C. Delicato, Kevin Wang, and Runhe Huang

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Guo, B., Chen, H., Liu, Y. et al. From crowdsourcing to crowdmining: using implicit human intelligence for better understanding of crowdsourced data. World Wide Web 23, 1101–1125 (2020). https://doi.org/10.1007/s11280-019-00718-5

  • DOI: https://doi.org/10.1007/s11280-019-00718-5

Keywords

  • Data-centric crowdsourcing
  • Crowd mining
  • Implicit human intelligence
  • Mobile crowd sensing
  • Social media