Genome of Human-Enabled Big Data Analytics

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 891)

Abstract

Nowadays, we live in the big data era, in which organizations are keen to analyze gigantic information resources in order to know their customers better, provide better services, and increase their income. Recently, with the emergence of crowdsourcing techniques, people have also become involved in the process of big data analysis. The inclusion of human computing power, while it helps improve the quality of the results, can raise serious challenges that need deeper investigation. In this paper, we propose a genome for human-enabled big data analytics, to better understand such analytics and to study where and why people get involved in such systems. We then study the challenges that arise as a result of such involvement and propose some future research directions.

Author information

Corresponding author

Correspondence to Mohammad Allahbakhsh.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Allahbakhsh, M., Arbabi, S., Motahari-Nezhad, HR., Benatallah, B. (2019). Genome of Human-Enabled Big Data Analytics. In: Grandinetti, L., Mirtaheri, S., Shahbazian, R. (eds) High-Performance Computing and Big Data Analysis. TopHPC 2019. Communications in Computer and Information Science, vol 891. Springer, Cham. https://doi.org/10.1007/978-3-030-33495-6_6

  • DOI: https://doi.org/10.1007/978-3-030-33495-6_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33494-9

  • Online ISBN: 978-3-030-33495-6

  • eBook Packages: Computer Science, Computer Science (R0)
