Approaches to Optimize Memory Footprint for Elephant Flows

  • Conference paper
Proceedings of Data Analytics and Management

Abstract

Technology changes too quickly to resist or ignore. Most work on massive and streaming data tries to scale up the computation, yet scale-up and scale-out each come with their own advantages and disadvantages. Computational hardware is already being pushed to its limits, even with the best available algorithms. Our work proposes and implements two approaches, (a) memory bound and (b) ingestion bound, that optimize the memory footprint and speed up the computation. We compare the two techniques on two datasets, (a) MoMA and (b) Yelp, and find up to 80% conditional optimization. The results show the optimization achieved for big data streams, also known as Elephant Flows, and indicate that the approach is worth implementing compared with other state-of-the-art techniques.
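The abstract does not spell out how the memory-bound and ingestion-bound approaches work, so the following is only a minimal sketch of one common way to cap the memory footprint of a stream: ingest it in fixed-size chunks and keep only running aggregates. It is not the authors' method. The helper names (`iter_chunks`, `count_by_column`), the `chunk_size` parameter, and the tiny sample data are all hypothetical and chosen for illustration.

```python
import csv
import io
from collections import Counter
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield lists of at most chunk_size rows from any row iterator."""
    rows = iter(rows)
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            return
        yield chunk


def count_by_column(stream, column, chunk_size=2):
    """Count values of one CSV column while holding one chunk at a time.

    Returns the counts and the largest number of rows that were
    resident at once (an illustrative stand-in for peak memory).
    """
    reader = csv.DictReader(stream)
    totals = Counter()
    peak_rows = 0
    for chunk in iter_chunks(reader, chunk_size):
        peak_rows = max(peak_rows, len(chunk))
        totals.update(row[column] for row in chunk)
    return totals, peak_rows


# Hypothetical sample standing in for a large file or network stream.
sample = io.StringIO(
    "city,stars\nPhoenix,5\nToronto,3\nPhoenix,4\nToronto,2\nPhoenix,1\n"
)
totals, peak_rows = count_by_column(sample, "city", chunk_size=2)
print(dict(totals), peak_rows)
```

In this sketch the number of rows held in memory never exceeds `chunk_size`, regardless of how long the stream is. The running `Counter` still grows with the number of distinct keys, so a real deployment would need to bound that as well.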

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Kumar, V., Sharma, D.K., Mishra, V.K. (2022). Approaches to Optimize Memory Footprint for Elephant Flows. In: Gupta, D., Polkowski, Z., Khanna, A., Bhattacharyya, S., Castillo, O. (eds) Proceedings of Data Analytics and Management. Lecture Notes on Data Engineering and Communications Technologies, vol 91. Springer, Singapore. https://doi.org/10.1007/978-981-16-6285-0_34
