Big Data: An Introduction

Mohanty, Hrushikesha

doi:10.1007/978-81-322-2494-5_1


Chapter
First Online: 01 January 2015


Part of the book series: Studies in Big Data (SBD, volume 11)

Abstract

The term big data is now well understood through its well-defined characteristics, and the use of big data now looks promising. As an introduction, this chapter draws a comprehensive picture of the progress of big data. It first defines the characteristics of big data and then presents its usage in different domains. The challenges of processing big data, as well as guidelines for doing so, are outlined. The state of the art in the hardware and software technologies required for big data processing is discussed, followed by a brief discussion of the tools currently available. Finally, research issues in big data are identified. The references surveyed for this chapter introduce different facets of this emerging area of data science and give interested readers a lead for pursuing the subject further.


References

Zikopoulos, P.C., Eaton, C., deRoos, D., Deutsch, T., Lapis, G.: Understanding Big Data. McGraw-Hill, New York (2012)
García, A.O., Bourov, S., Hammad, A., Hartmann, V., Jejkal, T., Otte, J.C., Pfeiffer, S., Schenker, T., Schmidt, C., Neuberger, P., Stotzka, R., van Wezel, J., Neumair, B., Streit, A.: Data-intensive analysis for scientific experiments at the large scale data facility. In: IEEE Symposium on Large Data Analysis and Visualization (LDAV), pp. 125–126 (2011)
O’Leary, D.E.: Artificial intelligence and big data. Intell. Syst. IEEE 28, 96–99 (2013)
Berman, J.J.: Introduction. In: Principles of Big Data, pp. xix-xxvi. Morgan Kaufmann, Boston (2013)
Chen, M., Mao, S., Liu, Y.: Big data: a survey. Mob. Netw. Appl. 19, 171–209 (2014)
Hashem, I.A.T., Yaqoob, I., Anuar, N.B., Mokhtar, S., Gani, A., Khan, S.U.: The rise of “big data” on cloud computing: review and open research issues. Inf. Syst. 47, 98–115 (2015)
Lusch, R.F., Liu, Y., Chen, Y.: The phase transition of markets and organizations: the new intelligence and entrepreneurial frontier. IEEE Intell. Syst. 25(1), 71–75 (2010)
Chen, H., Chiang, R.H.L., Storey, V.C.: Business intelligence and analytics: from big data to big impact. MIS Quarterly 36(4), 1165–1188 (2012)
Adomavicius, G., Tuzhilin, A.: Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. 17(6), 734–749 (2005)
Chen, H.: Smart health and wellbeing. IEEE Intell. Syst. 26(5), 78–79 (2011)
Parida, L., Haiminen, N., Haws, D., Suchodolski, J.: Host trait prediction of metagenomic data for topology-based visualisation. LNCS 5956, 134–149 (2015)
Chen, H.: Dark Web: Exploring and Mining the Dark Side of the Web. Springer, New York (2012)
NSF: Program Solicitation NSF 12-499: Core techniques and technologies for advancing big data science & engineering (BIGDATA). http://www.nsf.gov/pubs/2012/nsf12499/nsf12499.htm (2012). Accessed 12th Feb 2015
Salton, G.: Automatic Text Processing. Addison-Wesley, Reading, MA (1989)
Manning, C.D., Schütze, H.: Foundations of Statistical Natural Language Processing. The MIT Press, Cambridge (1999)
Big Data Spectrum, Infosys. http://www.infosys.com/cloud/resource-center/Documents/big-data-spectrum.pdf
Short, E., Bohn, R.E., Baru, C.: How much information? 2010 report on enterprise server information. UCSD Global Information Industry Center (2011)
http://public.web.cern.ch/public/en/LHC/Computing-en.html
http://www.youtube.com/yt/press/statistics.html
http://agbeat.com/tech-news/how-carriers-gather-track-and-sell-your-private-data/
http://www.information-management.com/issues/21_5/big-data-is-scaling-bi-and-analytics-10021093-1.html
Rahm, E., Do, H.H.: Data cleaning: problems and current approaches. IEEE Data Eng. Bull. 23, 3–13 (2000)
Agrawal, D., Bernstein, P., Bertino, E., Davidson, S., Dayal, U., Franklin, M., Gehrke, J., Haas, L., Han, J., Halevy, A., Jagadish, H.V., Labrinidis, A., Madden, S., Papakonstantinou, Y., Patel, J., Ramakrishnan, R., Ross, K., Shahabi, C., Suciu, D., Vaithyanathan, S., Widom, J.: Challenges and opportunities with big data. Cyber Center Technical Reports, Purdue University (2011)
Kasavajhala, V.: Solid state drive vs. hard disk drive price and performance study. In: Dell PowerVault Tech. Mark (2012)
Hutchinson, L.: Solid-state revolution: in-depth on how SSDs really work. Ars Technica (2012)
Pirovano, A., Lacaita, A.L., Benvenuti, A., Pellizzer, F., Hudgens, S., Bez, R.: Scaling analysis of phase-change memory technology. IEEE Int. Electron Dev. Meeting, 29.6.1–29.6.4 (2003)
Chen, S., Gibbons, P.B., Nath, S.: Rethinking database algorithms for phase change memory. In: CIDR, pp. 21–31. www.cidrdb.org (2011)
Venkataraman, S., Tolia, N., Ranganathan, P., Campbell, R.H.: Consistent and durable data structures for non-volatile byte-addressable memory. In: Ganger, G.R., Wilkes, J. (eds.) FAST, pp. 61–75. USENIX (2011)
Athanassoulis, M., Ailamaki, A., Chen, S., Gibbons, P., Stoica, R.: Flash in a DBMS: where and how? IEEE Data Eng. Bull. 33(4), 28–34 (2010)
Condit, J., Nightingale, E.B., Frost, C., Ipek, E., Lee, B.C., Burger, D., Coetzee, D.: Better I/O through byte-addressable, persistent memory. In: Proceedings of the 22nd Symposium on Operating Systems Principles (22nd SOSP’09), Operating Systems Review (OSR), pp. 133–146, ACM SIGOPS, Big Sky, MT (2009)
Wang, Q., Ren, K., Lou, W., Zhang, Y.: Dependable and secure sensor data storage with dynamic integrity assurance. In: Proceedings of the IEEE INFOCOM, pp. 954–962 (2009)
Oprea, A., Reiter, M.K., Yang, K.: Space efficient block storage integrity. In: Proceedings of the 12th Annual Network and Distributed System Security Symposium (NDSS 05) (2005)
Hashem, I.A.T., Yaqoob, I., Anuar, N.B., Mokhtar, S., Gani, A., Khan, S.U.: The rise of “big data” on cloud computing: review and open research issues. Inf. Syst. 47, 98–115 (2015)
Wang, Q., Wang, C., Ren, K., Lou, W., Li, J.: Enabling public auditability and data dynamics for storage security in cloud computing. IEEE Trans. Parallel Distrib. Syst. 22(5), 847–859 (2011)
Oehmen, C., Nieplocha, J.: Scalablast: a scalable implementation of blast for high-performance data-intensive bioinformatics analysis. IEEE Trans. Parallel Distrib. Syst. 17(8), 740–749 (2006)
Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., Hung Byers, A.: Big data: The Next Frontier for Innovation, Competition, and Productivity. McKinsey Global Institute (2012)
Chen, C.L.P., Zhang, C.-Y.: Data-intensive applications, challenges, techniques and technologies: a survey on big data. Inf. Sci. 275, 314–347 (2014)
Marz, N., Warren, J.: Big data: principles and best practices of scalable real-time data systems. Manning (2012)
Garber, L.: Using in-memory analytics to quickly crunch big data. IEEE Comput. Soc. 45(10), 16–18 (2012)
Molinari, C.: No one size fits all strategy for big data, Says IBM. http://www.bnamericas.com/news/technology/no-one-size-fits-all-strategy-for-big-data-says-ibm, October 2012
Ferguson, M.: Architecting a big data platform for analytics, Intelligent Business Strategies. https://www.ndm.net/datawarehouse/pdf/Netezza (2012). Accessed 19th Feb 2015
Ranganathan, P., Chang, J.: (Re)designing data-centric data centers. IEEE Micro 32(1), 66–70 (2012)
Iyer, R., Illikkal, R., Zhao, L., Makineni, S., Newell, D., Moses, J., Apparao, P.: Datacenter-on-chip architectures: tera-scale opportunities and challenges. Intel Tech. J. 11(3), 227–238 (2007)
Tang, J., Liu, S., Gu, Z., Li, X.-F., Gaudiot, J.-L.: Achieving middleware execution efficiency: hardware-assisted garbage collection operations. J. Supercomput. 59(3), 1101–1119 (2012)
Made in IBM labs: holey optochip first to transfer one trillion bits of information per second using the power of light, 2012. http://www-03.ibm.com/press/us/en/pressrelease/37095.wss
Farrington, N., Porter, G., Radhakrishnan, S., Bazzaz, H.H., Subramanya, V., Fainman, Y., Papen, G., Vahdat, A.: Helios: a hybrid electrical/optical switch architecture for modular data centers. In: Kalyanaraman, S., Padmanabhan, V.N., Ramakrishnan, K.K., Shorey, R., Voelker, G.M. (eds.) SIGCOMM, pp. 339–350. ACM (2010)
Popek, G.J., Goldberg, R.P.: Formal requirements for virtualizable third generation architectures. Commun. ACM 17(7), 412–421 (1974)
Andersen, R., Vinter, B.: The scientific byte code virtual machine. In: GCA, pp. 175–181 (2008)
Kambatla, K., Kollias, G., Kumar, V., Grama, A.: Trends in big data analytics. J. Parallel Distrib. Comput. 74, 2561–2573 (2014)
Brewer, E.A.: Towards robust distributed systems. In: Proceedings of the 19th Annual ACM Symposium on Principles of Distributed Computing (PODC), pp. 7–10 (2000)
DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W.: Dynamo: Amazon’s highly available key-value store. In: Proceedings of Twenty-First ACM SIGOPS Symposium on Operating Systems Principles, SOSP’07, ACM, New York, NY, USA, pp. 205–220 (2007)
Lakshman, A., Malik, P.: Cassandra: a structured storage system on a p2p network. In: SPAA (2009)
Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. In: OSDI (2004)
Apache yarn. http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html
Hortonworks blog. http://hortonworks.com/blog/executive-video-series-the-hortonworks-vision-for-apache-hadoop
Condie, T., Conway, N., Alvaro, P., Hellerstein, J.M., Elmeleegy, K., Sears, R.: MapReduce online. In: NSDI’10 Proceedings of the 7th USENIX conference on Networked systems design and implementation, p. 21
Kambatla, K., Rapolu, N., Jagannathan, S., Grama, A.: Asynchronous algorithms in MapReduce. In: IEEE International Conference on Cluster Computing, CLUSTER (2010)
Ranger, C., Raghuraman, R., Penmetsa, A., Bradski, G., Kozyrakis, C.: Evaluating MapReduce for multi-core and multiprocessor systems. In: Proceedings of the 13th International Symposium on High-Performance Computer Architecture (HPCA), Phoenix, AZ (2007)
Zaharia, M., Konwinski, A., Joseph, A.D., Katz, R., Stoica, I.: Improving MapReduce performance in heterogeneous environments. In: OSDI. USENIX Association, San Diego, CA (2008)
Polato, I., Ré, R., Goldman, A., Kon, F.: A comprehensive view of Hadoop research—a systematic literature review. J. Netw. Comput. Appl. 46, 1–25 (2014)
Valiant, L.G.: A bridging model for parallel computation. Commun. ACM 33(8), 103–111 (1990)
Malewicz, G., Austern, M.H., Bik, A.J.C., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G.: Pregel: a system for large-scale graph processing. In: SIGMOD (2010)
Phoebus. https://github.com/xslogic/phoebus
Ahmad, Y., Berg, B., Cetintemel, U., Humphrey, M., Hwang, J.-H., Jhingran, A., Maskey, A., Papaemmanouil, O., Rasin, A., Tatbul, N., Xing, W., Xing, Y., Zdonik, S.: Distributed operation in the borealis stream processing engine. In: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, SIGMOD ‘05, pp. 882–884, ACM, New York, NY, USA (2005)
Andrade, H., Gedik, B., Wu, K.L., Yu, P.S.: Processing high data rate streams in system S. J. Parallel Distrib. Comput. 71(2), 145–156 (2011)
Power, R., Li, J.: Piccolo: building fast, distributed programs with partitioned tables. In: OSDI (2010)
Rapolu, N., Kambatla, K., Jagannathan, S., Grama, A.: TransMR: data-centric programming beyond data parallelism. In: Proceedings of the 3rd USENIX Conference on Hot Topics in Cloud Computing, HotCloud’11, USENIX Association, Berkeley, CA, USA, pp. 19–19 (2011)
Isard, M., Budiu, M., Yu, Y., Birrell, A., Fetterly, D.: Dryad: distributed data-parallel programs from sequential building blocks. In: EuroSys ’07 Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems, vol. 41, no. 3, pp. 59–72 (2007)
Wayner, P.: 7 top tools for taming big data. http://www.networkworld.com/reviews/2012/041812-7-top-tools-for-taming-258398.html (2012)
Pentaho Business Analytics. 2012. http://www.pentaho.com/explore/pentaho-business-analytics/
Diana Samuels, Skytree: machine learning meets big data. http://www.bizjournals.com/sanjose/blog/2012/02/skytree-machine-learning-meets-big-data.html?page=all, February 2012
Brooks, J.: Review: Talend open studio makes quick work of large data sets. http://www.eweek.com/c/a/Database/REVIEW-Talend-Open-Studio-Makes-Quick-ETL-Work-of-Large-Data-Sets-281473/ (2009)
Karmasphere Studio and Analyst. http://www.karmasphere.com/ (2012)
IBM Infosphere. http://www-01.ibm.com/software/in/data/infosphere/
Auradkar, A., Botev, C., Das, S., De Maagd, D., Feinberg, A., Ganti, P., Ghosh, B., Gao, L., Gopalakrishna, K., Harris, B., Koshy, J., Krawez, K., Kreps, J., Lu, S., Nagaraj, S., Narkhede, N., Pachev, S., Perisic, I., Qiao, L., Quiggle, T., Rao, J., Schulman, B., Sebastian, A., Seeliger, O., Silberstein, A., Shkolnik, B., Soman, C., Sumbaly, R., Surlaker, K., Topiwala, S., Tran, C., Varadarajan, B., Westerman, J., White, Z., Zhang, D., Zhang, J.: Data infrastructure at linkedin. In: 2012 IEEE 28th International Conference on Data Engineering (ICDE), pp. 1370–1381 (2012)
Kraft, S., Casale, G., Jula, A., Kilpatrick, P., Greer, D.: Wiq: work-intensive query scheduling for in-memory database systems. In: 2012 IEEE 5th International Conference on Cloud Computing (CLOUD), pp. 33–40 (2012)
Samson, T.: Splunk storm brings log management to the cloud. http://www.infoworld.com/t/managed-services/splunk-storm-brings-log-management-the-cloud-201098?source=footer (2012)
Storm. http://storm-project.net/ (2012)
Sqlstream. http://www.sqlstream.com/products/server/ (2012)
Neumeyer, L., Robbins, B., Nair, A., Kesari, A.: S4: distributed stream computing platform. In: 2010 IEEE Data Mining Workshops (ICDMW), pp. 170–177, Sydney, Australia (2010)
Kelly, J.: Apache drill brings SQL-like, ad hoc query capabilities to big data. http://wikibon.org/wiki/v/Apache-Drill-Brings-SQL-Like-Ad-Hoc-Query-Capabilities-to-Big-Data, February 2013
Melnik, S., Gubarev, A., Long, J.J., Romer, G., Shivakumar, S., Tolton, M., Vassilakis, T.: Dremel: interactive analysis of webscale datasets. In: Proceedings of the 36th International Conference on Very Large Data Bases (2010), vol. 3(1), pp. 330–339 (2010)
Li, X., Yao, X.: Cooperatively coevolving particle swarms for large scale optimization. IEEE Trans. Evol. Comput. 16(2), 210–224 (2012)
Yang, Z., Tang, K., Yao, X.: Large scale evolutionary optimization using cooperative coevolution. Inf. Sci. 178(15), 2985–2999 (2008)
Yan, J., Liu, N., Yan, S., Yang, Q., Fan, W., Wei, W., Chen, Z.: Trace-oriented feature analysis for large-scale text data dimension reduction. IEEE Trans. Knowl. Data Eng. 23(7), 1103–1117 (2011)
Spiliopoulou, M., Hatzopoulos, M., Cotronis, Y.: Parallel optimization of large join queries with set operators and aggregates in a parallel environment supporting pipeline. IEEE Trans. Knowl. Data Eng. 8(3), 429–445 (1996)
Di Ciaccio, A., Coli, M., Ibanez, A., Miguel, J.: Advanced Statistical Methods for the Analysis of Large Data-Sets. Springer, Berlin (2012)
Pébay, P., Thompson, D., Bennett, J., Mascarenhas, A.: Design and performance of a scalable, parallel statistics toolkit. In: 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW), pp. 1475–1484 (2011)
Klemens, B.: Modeling with Data: Tools and Techniques for Statistical Computing. Princeton University Press, New Jersey (2008)
Wilkinson, L.: The future of statistical computing. Technometrics 50(4), 418–435 (2008)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edn. Springer, Berlin (2009)
Jamali, M., Abolhassani, H.: Different aspects of social network analysis. In: IEEE/WIC/ACM International Conference on Web Intelligence, WI 2006, pp. 66–72 (2006)
Zhang, Y., van der Schaar, M.: Information production and link formation in social computing systems. IEEE J. Sel. Areas Commun. 30(1), 2136–2145 (2012)
Bringmann, B., Berlingerio, M., Bonchi, F., Gionis, A.: Learning and predicting the evolution of social networks. IEEE Intell. Syst. 25(4), 26–35 (2010)
Fekete, J.-D., Henry, N., McGuffin, M.: Nodetrix: a hybrid visualization of social network. IEEE Trans. Visual. Comput. Graph. 13(6), 1302–1309 (2007)
Shen, Z., Ma, K.-L., Eliassi-Rad, T.: Visual analysis of large heterogeneous social networks by semantic and structural abstraction. IEEE Trans. Visual. Comput. Graph. 12(6), 1427–1439 (2006)
Lin, C.-Y., Lynn, W., Wen, Z., Tong, H., Griffiths-Fisher, V., Shi, L., Lubensky, D.: Social network analysis in enterprise. Proc. IEEE 100(9), 2759–2776 (2012)
Ma, H., King, I., Lyu, M.R.-T.: Mining web graphs for recommendations. IEEE Trans. Knowl. Data Eng. 24(12), 1051–1064 (2012)
Lane, N.D., Ye, X., Hong, L., Campbell, A.T., Choudhury, T., Eisenman, S.B.: Exploiting social networks for large-scale human behavior modeling. IEEE Pervasive Comput. 10(4), 45–53 (2011)
Bengio, Y.: Learning deep architectures for AI. Found. Trends Mach. Learn. 2(1), 1–127 (2009)
Seiffert, U.: Training of large-scale feed-forward neural networks. In: International Joint Conference on Neural Networks, IJCNN ‘06, pp. 5324–5329 (2006)
Arel, I., Rose, D.C., Karnowski, T.P.: Deep machine learning—a new frontier in artificial intelligence research. IEEE Comput. Intell. Mag. 5(4), 13–18 (2010)
Bengio, Y., Courville, A., Vincent, P.: Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013)
Le, Q.V., Ranzato, M.A., Monga, R., Devin, M., Chen, K., Corrado, G.S., Dean, J., Ng, A.Y.: Building high-level features using large scale unsupervised learning. In: Proceedings of the 29th International Conference on Machine Learning (2012)
Dong, Y., Deng, L.: Deep learning and its applications to signal and information processing. IEEE Signal Process. Mag. 28(1), 145–154 (2011)
Ciresan, D., Meier, U., Schmidhuber, J.: Multi-column deep neural networks for image classification. In: IEEE Conference on Computer Vision and Pattern Recognition (2012)
Simoff, S., Böhlen, M.H., Mazeika, A.: Visual Data Mining: Theory, Techniques and Tools for Visual Analytics. Springer, Berlin (2008)
Thompson, D., Levine, J.A., Bennett, J.C., Bremer, P.T., Gyulassy, A., Pascucci, V., Pébay, P.P.: Analysis of large-scale scalar data using hixels. In: 2011 IEEE Symposium on Large Data Analysis and Visualization (LDAV), pp. 23–30 (2011)
Pedrycz, W., Skowron, A., Kreinovich, V. (eds.): Handbook of Granular Computing. Wiley, New York (2008)
Peters, G.: Granular box regression. IEEE Trans. Fuzzy Syst. 19(6), 1141–1151 (2011)
Su, S.-F., Chuang, C.-C., Tao, C.W., Jeng, J.-T., Hsiao, C.-C.: Radial basis function networks with linear interval regression weights for symbolic interval data. IEEE Trans. Syst. Man Cyber.–Part B Cyber. 19(6), 1141–1151 (2011)
Simon, D.R.: On the power of quantum computation. SIAM J. Comput. 26, 116–123 (1994)
Lloyd, S.P.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–137 (1982)
Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information. Cambridge University Press, Cambridge (2009)
Furht, B., Escalante, A.: Handbook of Cloud Computing. Springer, Berlin (2011)
Schadt, E.E., Linderman, M.D., Sorenson, J., Lee, L., Nolan, G.P.: Computational solutions to large-scale data management and analysis. Nat. Rev. Genet. 11(9), 647–657 (2010)
Sipper, M., Sanchez, E., Mange, D., Tomassini, M., Pérez-Uribe, A., Stauffer, A.: A phylogenetic, ontogenetic, and epigenetic view of bio-inspired hardware systems. IEEE Trans. Evol. Comput. 1(1), 83–97 (1997)
Bongard, J.: Biologically inspired computing. Computer 42(4), 95–98 (2009)
Ratner, M., Ratner, D.: Nanotechnology: A Gentle Introduction to the Next Big Idea, 1st edn. Prentice Hall Press, Upper Saddle River (2002)
Weiss, R., Basu, S., Hooshangi, S., Kalmbach, A., Karig, D., Mehreja, R., Netravali, I.: Genetic circuit building blocks for cellular computation, communications, and signal processing. Nat. Comput. 2, 47–84 (2003)
Wang, L., Shen, J.: Towards bio-inspired cost minimisation for data-intensive service provision. In: 2012 IEEE First International Conference on Services Economics (SE), pp. 16–23 (2012)


Author information

Authors and Affiliations

School of Computer and Information Sciences, University of Hyderabad, Gachhibowli, 500046, Hyderabad, India
Hrushikesha Mohanty


Corresponding author

Correspondence to Hrushikesha Mohanty.

Editor information

Editors and Affiliations

School of Computer and Information Sciences, University of Hyderabad, Hyderabad, Andhra Pradesh, India
Hrushikesha Mohanty
School of Computer Engineering, KIIT University, Bhubaneshwar, Odisha, India
Prachet Bhuyan
Teradata India Private Limited, Hyderabad, Andhra Pradesh, India
Deepak Chenthati

Exercise

1. Define big data. Explain with an example.
2. List the possible sources that generate big data.
3. Discuss the usage of big data in different domains.
4. Why is it called “Big Data as a Service”? Justify your answer.
5. What makes big data processing difficult?
6. Discuss the guidelines for big data processing.
7. Draw an ecosystem for a big data system. Explain the functionality of each component.
8. Discuss the hardware and software technologies required for big data processing.
9. Make a list of big data tools and note their functionality.
10. Discuss trends in big data research.


About this chapter

Cite this chapter

Mohanty, H. (2015). Big Data: An Introduction. In: Mohanty, H., Bhuyan, P., Chenthati, D. (eds) Big Data. Studies in Big Data, vol 11. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2494-5_1


DOI: https://doi.org/10.1007/978-81-322-2494-5_1
Published: 28 June 2015
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2493-8
Online ISBN: 978-81-322-2494-5
eBook Packages: Engineering, Engineering (R0)
