Abstract
The volume and variety of data generated using computers is doubling every two years. It is estimated that in 2015, 8 zettabytes (zetta = 10^21) of data were generated, consisting mostly of unstructured data such as emails, blogs, Twitter and Facebook posts, images, and videos. This is called big data. Such huge data collections can be analysed with clusters of thousands of inexpensive computers to discover patterns that have many applications. However, analysing the massive amounts of data available on the Internet can impinge on our privacy, and inappropriate analysis of big data can lead to misleading conclusions. In this article, we explain what big data is and how it is analysed, and give some case studies illustrating the potential and pitfalls of big data analytics.
Suggested Reading
H J Watson, Tutorial: Big data analytics: Concepts, technologies, and applications, Communications of the Association for Information Systems, Vol.34, Article 65, pp.124–168, 2014.
NIST definition of big data and data science, www.101.datascience.community/2015/nist-defines-big-data-and-data-science.
J S Ward and A Baker, A Survey of Big Data Definitions, arxiv.org/abs/1309.5821v1.
T Hey, S Tansley and K Tolle (Editors), The Fourth Paradigm–Data-Intensive Scientific Discovery, Microsoft Research, Redmond, WA, USA, 2009.
R P Srikant, 8 Innovative examples of big data usage in India, Dataquest, August 20, 2016.
C Metz, In a huge breakthrough, Google’s AI beats a top player at the game of GO, www.wired.com/2016/01/in-a-huge-breakthrough-googlesai-beats-a-GO-champion/
M H Hassoun, Fundamentals of Artificial Neural Networks, Prentice-Hall of India, 1998.
Big Data Hadoop Tutorial, www.tutorialpoint.com/hadoop/hadooptutorial.pdf
V Rajaraman and C Siva Ram Murthy, Parallel Computers: Architecture and Programming, 2nd Edition, Chapter 6, PHI Learning, New Delhi, 2016.
S Smith, Data privacy: Now you see me; New model of data sharing: Modern governance and statisticians, Significance, Vol.11, No.4, pp.10–17, Oct 2014. Available at URL: http://onlinelibrary.wiley.com/doi/10.1111/j.1740-9713.2014.00762.x/full
D Lazer, et al, The parable of Google Flu: Traps in big data analysis, Science, Vol.343, April 2014.
Author information
V Rajaraman is at the Indian Institute of Science, Bengaluru. Several generations of scientists and engineers in India have learnt computer science using his lucidly written textbooks on programming and computer fundamentals. His current research interests are parallel computing and history of computing.
About this article
Cite this article
Rajaraman, V. Big data analytics. Reson 21, 695–716 (2016). https://doi.org/10.1007/s12045-016-0376-7
DOI: https://doi.org/10.1007/s12045-016-0376-7