Construing the big data based on taxonomy, analytics and approaches

  • Original Article
  • Iran Journal of Computer Science

Abstract

Big data has become an important asset because of the value that analytics can extract from it. Organizations are inundated with colossal amounts of data generated at high speed; extracting value from this data requires high-performance storage and processing resources as well as specialized skills and technologies. Sources of big data may be internal or external to an organization, and the data may be structured, semi-structured, or unstructured. Artificial intelligence, the Internet of Things, and social media are contributing to the growth of big data. Analytics is the use of statistics, mathematics, and machine learning to derive meaningful insights from data, supporting timely decisions and enabling the data-driven organization of the future. This paper sheds light on big data, the taxonomy of data, and the hierarchical journey of data from its original form to high-level understanding in the form of wisdom. The paper also focuses on the key characteristics of big data and the challenges of handling it. In addition, big data storage systems are briefly covered to show how they accommodate the requirements of big data. The paper describes the eras in the evolution of analytics, from descriptive through predictive to prescriptive analytics. Process models used for inferring information from data are compared, and their applicability to analyzing big data is examined. Finally, recent developments in the domain of big data and analytics are compared on the basis of state-of-the-art approaches.

Author information

Corresponding author

Correspondence to Ajeet Ram Pathak.

Appendix

Table 8 lists the abbreviations used in the paper.

Table 8 List of abbreviations

About this article

Cite this article

Pathak, A.R., Pandey, M. & Rautaray, S. Construing the big data based on taxonomy, analytics and approaches. Iran J Comput Sci 1, 237–259 (2018). https://doi.org/10.1007/s42044-018-0024-3

  • DOI: https://doi.org/10.1007/s42044-018-0024-3
