Big Streaming Data Sampling and Optimization

  • Conference paper
IT Convergence and Security 2017

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 449))

Abstract

This research addresses and resolves issues with the confidence level of samples drawn from big streaming data, which varies with the speed of the stream and with a sample space that changes dynamically. Building on the preliminary work and results in [8], it focuses on the confidence level and on thresholds for the dynamic population size, so that a better confidence level of the sampled data can be ensured with respect to several variables: the speed of the streaming data, the population size as it changes over time, the sample space (or size), the speed of the sampling algorithm, the size of the streaming data, and the duration of the stream. Theoretical thresholds for processing big streaming data with respect to these variables are identified for the purpose of optimization. Simulation and experimental results are provided to validate the efficacy of the proposed theoretical thresholds.
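
To make the relationship between population size, sample size, and confidence level concrete, the following is a minimal sketch using the standard textbook sample-size formula for a proportion (Cochran's formula with a finite-population correction). It is not the thresholds derived in the paper, which are not reproduced in this abstract. The function names, the 5% margin of error, and the worst-case proportion p = 0.5 are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def required_sample_size(population, confidence=0.95, margin=0.05, p=0.5):
    """Sample size needed to estimate a proportion at the given confidence.

    Textbook Cochran formula with finite-population correction; an
    illustrative assumption, not the method proposed in the paper.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Sample size for an effectively infinite population.
    n0 = z * z * p * (1 - p) / (margin * margin)
    # Correct for the finite number of items seen so far.
    n = n0 / (1 + (n0 - 1) / population)
    return min(population, math.ceil(n))


def population_after(rate_per_sec, seconds):
    """Hypothetical helper: items that have arrived on a constant-rate stream."""
    return int(rate_per_sec * seconds)


if __name__ == "__main__":
    # The population grows with stream speed and elapsed time, so the
    # required sample size is recomputed as the stream progresses.
    for seconds in (1, 10, 100, 1000):
        seen = population_after(rate_per_sec=100, seconds=seconds)
        print(seconds, seen, required_sample_size(seen))
```

Under these assumptions the required sample size rises quickly while the population is small and then levels off, which is one way to see why a threshold on the dynamic population size matters for a fixed confidence level.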

References

  1. Tang, F., Li, L., Barolli, L., Tang, C.: An efficient sampling and classification approach for flow detection in SDN-based big data centers. Journal 2(5), 99–110 (2016)

  2. Gadepally, V., Herr, T., Johnson, L., Milechin, L., Milosavljevic, M., Miller, B.A.: Sampling operations on big data. In: 2015 49th Asilomar Conference on Signals, Systems and Computers, 8 November 2015

  3. Xu, K., Wang, F., Jia, X., Wang, H.: The impact of sampling on big data analysis of social media: a case study on Flu and Ebola. In: 2015 49th Asilomar Conference on Signals, Systems and Computers, 6 December 2015

  4. Johnson, T., Muralikrishnan, S., Rozenbaum, I.: Sampling algorithms in a stream operator. In: SIGMOD Conference (2005)

  5. Zafar, M.B., Bhattacharya, P., Ganguly, N., Gummadi, K.P., Ghosh, S.: Sampling content from online social networks: comparing random vs. expert sampling of the twitter stream. ACM Trans. Web 9(3), 12 (2015)

  6. Teddlie, C., Yu, F.: Mixed methods sampling: a typology with examples. J. Mixed Methods Res. 1(1), 77–100 (2007)

  7. Park, B.H., Ostrouchov, G., Samatova, N.F., Geist, A.: Reservoir based random sampling with replacement from data stream. In: Proceedings of 2004 SIAM International Conference on Data Mining (2004)

  8. Kancharla, A., Kim, J., Park, N.-J., Park, N.: Big streaming data buffering optimization. In: International Conference on Computational Science/Intelligence/Applied Informatics (2016)

Author information

Corresponding author

Correspondence to Nohpill Park.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Kancharala, A., Park, N., Kim, J., Park, N. (2018). Big Streaming Data Sampling and Optimization. In: Kim, K., Kim, H., Baek, N. (eds) IT Convergence and Security 2017. Lecture Notes in Electrical Engineering, vol 449. Springer, Singapore. https://doi.org/10.1007/978-981-10-6451-7_27

  • DOI: https://doi.org/10.1007/978-981-10-6451-7_27

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6450-0

  • Online ISBN: 978-981-10-6451-7

  • eBook Packages: Engineering, Engineering (R0)
