Efficient data retrieval using adaptive clustered indexing for continuous queries over streaming data

  • M. R. Sumalatha
  • M. Ananthi

Abstract

Modern applications generate highly dynamic, heterogeneous and massive volumes of data from sensor networks, social media, telecommunications, stock market analyses and the Internet. Because these data arrive as real-time streams and change dynamically, processing continuous queries over them is challenging. Large volumes of data can be handled efficiently by partitioning them into clusters and then indexing them. An efficient clustering and indexing method is therefore required to process continuous queries that retrieve data from streams. A new index structure called adaptive clustering and block-based indexing (ACBBI) is proposed, which fuses cluster-based and block-based techniques to process continuous queries. The incoming data are clustered and stored as blocks using the adaptive clustering method and then indexed by the adaptive indexing approach. Livestock market values that are time variant are used for experimentation. The experimental analysis demonstrates that the ACBBI tree structure cuts the space cost by half, scales better with increasing data size and improves the retrieval rate by 30% over an existing CKDB approach.
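
The general idea of clustering incoming values, storing each cluster as fixed-size blocks and pruning blocks by a summary during a query can be illustrated with a minimal sketch. This is not the ACBBI algorithm itself, whose details the abstract does not give: the threshold-based clustering rule, the per-block min/max summary and all names (`StreamIndex`, `Cluster`, `Block`, `BLOCK_SIZE`, `RADIUS`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4   # hypothetical block capacity (assumption)
RADIUS = 5.0     # hypothetical distance threshold for joining a cluster (assumption)


@dataclass
class Block:
    """A fixed-capacity run of (timestamp, value) pairs with a min/max summary."""
    items: list = field(default_factory=list)
    lo: float = float("inf")
    hi: float = float("-inf")

    def add(self, ts, value):
        self.items.append((ts, value))
        self.lo = min(self.lo, value)
        self.hi = max(self.hi, value)


@dataclass
class Cluster:
    """A group of nearby values, stored as a list of blocks."""
    centroid: float
    count: int = 0
    blocks: list = field(default_factory=list)

    def add(self, ts, value):
        # Update the centroid as a running mean of the member values.
        self.count += 1
        self.centroid += (value - self.centroid) / self.count
        # Open a new block when there is none yet or the last one is full.
        if not self.blocks or len(self.blocks[-1].items) >= BLOCK_SIZE:
            self.blocks.append(Block())
        self.blocks[-1].add(ts, value)


class StreamIndex:
    """Toy cluster-then-block index over a one-dimensional value stream."""

    def __init__(self):
        self.clusters = []

    def insert(self, ts, value):
        # Assign the value to the nearest cluster, or start a new cluster
        # when no centroid lies within RADIUS of it.
        nearest = min(self.clusters,
                      key=lambda c: abs(c.centroid - value),
                      default=None)
        if nearest is None or abs(nearest.centroid - value) > RADIUS:
            nearest = Cluster(centroid=value)
            self.clusters.append(nearest)
        nearest.add(ts, value)

    def range_query(self, lo, hi):
        # Skip any block whose [lo, hi] summary cannot overlap the query range.
        hits = []
        for cluster in self.clusters:
            for block in cluster.blocks:
                if block.hi < lo or block.lo > hi:
                    continue
                hits.extend((ts, v) for ts, v in block.items if lo <= v <= hi)
        return sorted(hits)


index = StreamIndex()
for ts, price in enumerate([100.0, 101.0, 150.0, 99.0, 152.0, 102.0, 100.5]):
    index.insert(ts, price)
result = index.range_query(100.0, 102.0)
```

In this sketch a continuous query would simply call `range_query` again as new values arrive; a real system would more likely maintain results incrementally and adapt the threshold and block size to the stream.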

Keywords

Data stream; Indexing; Query processing; Data management; Clustering; Data retrieval; Continuous queries

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Department of Information Technology, Anna University, Chennai, India
  2. Department of Information Technology, Sri Sairam Engineering College, Chennai, India