From crowdsourcing to crowdmining: using implicit human intelligence for better understanding of crowdsourced data

Abstract

With the development of mobile social networks, more and more crowdsourced data are generated on the Web or collected from real-world sensing. The fragmented, heterogeneous, and noisy nature of online/offline crowdsourced data, however, makes them difficult to understand. Traditional content-based analysis methods suffer from issues such as computational intensiveness and poor performance. To address these issues, this paper presents CrowdMining. In particular, we observe that the knowledge hidden in the process of data generation, regarding individual/crowd behavior patterns (e.g., mobility patterns and community contexts such as social ties and structure) and crowd-object interaction patterns (e.g., flickering or tweeting patterns), is neglected in crowdsourced data mining. Therefore, we propose a novel approach that leverages implicit human intelligence (implicit HI) for crowdsourced data mining and understanding. Two studies, CrowdEvent and CrowdRoute, are presented to showcase its usage, where implicit HI is extracted from either online or offline crowdsourced data. A generic model for CrowdMining is further proposed based on a set of existing studies. Experiments on real-world datasets demonstrate the effectiveness of CrowdMining.

Notes

  1. http://www.mturk.com/
  2. http://www.crowdflower.com/
  3. http://www.twitter.com/
  4. http://www.wikipedia.org/
  5. http://answers.yahoo.com/
  6. http://www.yelp.com/
  7. http://www.digg.com
  8. www.youku.com
  9. http://www.flickr.com
  10. https://foursquare.com

Funding

This work was partially supported by the National Key R&D Program of China (2017YFB1001803), the National Basic Research Program of China (No. 2015CB352400), and the National Natural Science Foundation of China (Nos. 61772428, 61725205).

Author information

Correspondence to Bin Guo.

Additional information

This article belongs to the Topical Collection: Special Issue on Smart Computing and Cyber Technology for Cyberization

Guest Editors: Xiaokang Zhou, Flavia C. Delicato, Kevin Wang, and Runhe Huang

About this article

Cite this article

Guo, B., Chen, H., Liu, Y. et al. From crowdsourcing to crowdmining: using implicit human intelligence for better understanding of crowdsourced data. World Wide Web (2019). https://doi.org/10.1007/s11280-019-00718-5

Keywords

  • Data-centric crowdsourcing
  • Crowd mining
  • Implicit human intelligence
  • Mobile crowd sensing
  • Social media