Abstract
We live in the era of big data, a vast information resource that organizations are eager to analyze in order to understand their customers better, provide better services, and increase their income. With the emergence of crowdsourcing techniques, people have also become involved in the process of big data analysis. The inclusion of human computing power, while it helps improve the quality of the results, raises serious challenges that need deeper investigation. In this paper, we propose a genome for human-enabled big data analytics, to better understand such systems and to study where and why people get involved in them. We then study the challenges that arise as a result of this involvement and propose some future research directions.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Allahbakhsh, M., Arbabi, S., Motahari-Nezhad, HR., Benatallah, B. (2019). Genome of Human-Enabled Big Data Analytics. In: Grandinetti, L., Mirtaheri, S., Shahbazian, R. (eds) High-Performance Computing and Big Data Analysis. TopHPC 2019. Communications in Computer and Information Science, vol 891. Springer, Cham. https://doi.org/10.1007/978-3-030-33495-6_6
DOI: https://doi.org/10.1007/978-3-030-33495-6_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33494-9
Online ISBN: 978-3-030-33495-6
eBook Packages: Computer Science, Computer Science (R0)