Big Data, pp 1–28

Big Data: An Introduction

  • Chapter

Part of the book series: Studies in Big Data (SBD, volume 11)

Abstract

The term big data is now well understood through its well-defined characteristics, and the use of big data now looks increasingly promising. This introductory chapter draws a comprehensive picture of the progress of big data. It first defines the characteristics of big data and then presents its use in different domains. The challenges of processing big data, along with guidelines for doing so, are outlined. The state of the art in the hardware and software technologies required for big data processing is discussed, followed by a brief discussion of the tools currently available. Finally, research issues in big data are identified. The references surveyed for this chapter introduce different facets of this emerging area of data science and offer a lead to readers who wish to pursue their interests in the subject.

Author information

Corresponding author

Correspondence to Hrushikesha Mohanty.

Exercise

  1. Define big data. Explain with an example.

  2. List the possible sources that generate big data.

  3. Discuss the use of big data in different domains.

  4. Why is it called “Big Data as a Service”? Justify your answer.

  5. What makes big data processing difficult?

  6. Discuss the guidelines for big data processing.

  7. Draw an ecosystem for a big data system. Explain the functionality of each component.

  8. Discuss the hardware and software technologies required for big data processing.

  9. Make a list of big data tools and note their functionality.

  10. Discuss the trends in big data research.

Copyright information

© 2015 Springer India

About this chapter

Cite this chapter

Mohanty, H. (2015). Big Data: An Introduction. In: Mohanty, H., Bhuyan, P., Chenthati, D. (eds) Big Data. Studies in Big Data, vol 11. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2494-5_1

  • DOI: https://doi.org/10.1007/978-81-322-2494-5_1

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-2493-8

  • Online ISBN: 978-81-322-2494-5

  • eBook Packages: Engineering, Engineering (R0)
