
EPF: A General Framework for Supporting Continuous Top-k Queries Over Streaming Data

  • Hong Jiang
  • Rui Zhu (email author)
  • Bin Wang

Abstract

The continuous top-k query over a sliding window is a fundamental problem in streaming data management: it monitors the query window and retrieves the k objects with the highest scores each time the window slides. The key to supporting this query is to maintain a subset of the objects in the window and to retrieve answers from that subset when the window slides. The state-of-the-art approach, called SAP, uses a partition technique to support top-k searches. Its key idea is to find a proper partition so that the query can be answered with as few high-quality candidates as possible. However, it incurs a relatively high computation cost in evaluating whether a partition is proper and in re-scanning the window. In this paper, we propose an ELM-based framework named EPF, which improves SAP by learning the nature of the streaming data. If we learn that the distribution of the streaming data is predictable, we can construct a suitable prediction model that partitions the window more efficiently. Furthermore, we propose a novel algorithm to reduce the re-scanning cost. We conduct a thorough experimental study of this technique on real and synthetic datasets and show a significant performance improvement when the technique is applied to existing algorithms.
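
To make the query semantics concrete, the following is a minimal sketch of a naive baseline, assuming a count-based window in which every object carries a single numeric score. It is neither SAP nor EPF: it re-scans the whole window at every slide, which is exactly the cost those approaches try to avoid. The function name `continuous_top_k` and its parameters `window_size`, `slide`, and `k` are illustrative and do not come from the paper.

```python
import heapq
from collections import deque
from typing import Deque, Iterable, Iterator, List, Tuple


def continuous_top_k(
    scores: Iterable[float], window_size: int, slide: int, k: int
) -> Iterator[List[Tuple[int, float]]]:
    """Yield the k highest-scoring (arrival_index, score) pairs per slide.

    Hypothetical baseline: the window holds the most recent `window_size`
    objects, and a result is reported once the window is full and then
    after every `slide` new arrivals.
    """
    window: Deque[Tuple[int, float]] = deque()
    for t, score in enumerate(scores):
        window.append((t, score))
        if len(window) > window_size:
            window.popleft()  # the oldest object expires
        window_is_full = len(window) == window_size
        if window_is_full and (t + 1 - window_size) % slide == 0:
            # Full re-scan of the window: O(window_size * log k) per slide.
            yield heapq.nlargest(k, window, key=lambda obj: obj[1])
```

Calling `list(continuous_top_k(stream, 4, 1, 2))` on a list of scores returns one top-2 answer per window position. Candidate-maintenance approaches such as the ones discussed in the abstract aim to produce the same answers without scanning every object in the window at each slide.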

Keywords

ELM stream classification top-k 

Notes

Funding Information

This work is partially supported by the NSF of China under grant Nos. 61702344, 61272178, 61502317, U1401256, and the NSF of China for Key Program under grant No. 61532021.

Compliance with Ethical Standards

Conflict of interests

The authors declare that they have no potential conflict of interest. This article does not contain any studies involving human participants and/or animals by any of the authors. Informed consent was obtained from all individual participants.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Management, Shenyang University of Technology, Shenyang, China
  2. School of Computer Science, Shenyang Aerospace University, Shenyang, China
  3. College of Computer Science and Engineering, Northeastern University, Shenyang, China
