On the Need of Opening the Big Data Landscape to Everyone: Challenges and New Trends
The great variety and intrinsic complexity of current Big Data technologies hamper the development of analytic processes for large data sets in domains whose business experts are not expected to have specialized computing knowledge, such as data mining, parallel computing, machine learning or software development. New approaches are therefore needed to simplify and promote the adoption of these technologies, and to open them to everyone, in sectors such as health, economy and market analysis, where this kind of data processing is in high demand but still has to be outsourced. In this context, workflows are conceptually close to the business expert: they are a well-known mechanism for representing a sequence of domain-specific activities, and they enable the automation of data processes independently of the infrastructure requirements. In this chapter, we discuss the challenges that the widespread adoption of workflow-based Big Data processes must face. We also analyze existing workflow management tools and the new trends in the development of custom solutions for multiple domains.