Abstract
An Internet of Things (IoT) based Smart Grid (SG) is a power grid integrated with a large network of smart objects and characterized by information and communication technology. The data sources of an IoT-based SG, as well as their correlations, are usually intricate, which necessitates indexing techniques for complex queries over the SG dataset so that the rich content of the data can be exploited efficiently for characteristic analytics and fault prediction. As part of a popular big data platform, HBase is replacing classic relational databases to host huge heterogeneous data records in the form of key-value storage. However, most existing secondary index schemes on HBase are managed and retrieved by the corresponding data columns rather than by queries, which makes answering a complex data query inefficient. In this paper, we propose an adaptive indexing technique to speed up complex data queries on HBase for IoT-based SG big data. Our proposed method is based on the observation that most analyses over big power grid data focus on data subsets related to specific power grid events or monitoring data instead of the whole dataset. Theoretical analysis and experimental tests show that the proposed query-oriented secondary indexing scheme is feasible for improving query performance. For a join operation, our proposed indexing scheme achieves speedups ranging from 6.54× to 860× over a query scheme without secondary indexing, and from 1.20× to 8.68× over a classic secondary indexing scheme implemented on HBase. Our indexing technique would be a useful reference for other industrial big data practices.
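The difference between indexing by column and indexing by query can be illustrated with a minimal sketch. It is not the scheme proposed in the paper and does not use HBase: `ToyTable`, `register_query`, `indexed_query`, the row-key format, and the sample meter records are all hypothetical names and data chosen for illustration. The sketch only shows the general idea of keeping, for a recurring query, the row keys of the data subset that query touches, so that later executions avoid a full scan.

```python
import bisect
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Predicate = Callable[[Row], bool]


class ToyTable:
    """In-memory stand-in for a key-value table (hypothetical, not HBase)."""

    def __init__(self) -> None:
        self.rows: Dict[str, Row] = {}
        # query name -> (predicate, sorted row keys of the matching subset)
        self.query_index: Dict[str, Tuple[Predicate, List[str]]] = {}

    def register_query(self, name: str, predicate: Predicate) -> None:
        """Build the index for one recurring query with a single scan."""
        keys = sorted(k for k, row in self.rows.items() if predicate(row))
        self.query_index[name] = (predicate, keys)

    def put(self, key: str, row: Row) -> None:
        """Write a row and keep every registered query index consistent."""
        self.rows[key] = row
        for predicate, keys in self.query_index.values():
            present = key in keys
            if predicate(row) and not present:
                bisect.insort(keys, key)
            elif not predicate(row) and present:
                keys.remove(key)

    def full_scan(self, predicate: Predicate) -> List[Row]:
        """Baseline: evaluate the predicate against every row."""
        return [self.rows[k] for k in sorted(self.rows) if predicate(self.rows[k])]

    def indexed_query(self, name: str) -> List[Row]:
        """Answer a registered query by reading only the indexed row keys."""
        _, keys = self.query_index[name]
        return [self.rows[k] for k in keys]


def is_overload(row: Row) -> bool:
    # Assumed example of an "event" predicate; the field name is made up.
    return row["event"] == "overload"


table = ToyTable()
table.put("m1#t1", {"meter": "m1", "event": "overload", "load_kw": 9.5})
table.put("m2#t1", {"meter": "m2", "event": "normal", "load_kw": 3.1})
table.register_query("overload_events", is_overload)
table.put("m3#t2", {"meter": "m3", "event": "overload", "load_kw": 8.7})

overload_rows = table.indexed_query("overload_events")
```

In this toy version the index is maintained eagerly on every write; a real deployment would have to weigh that write overhead against the read savings, which this sketch does not attempt to model.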
Acknowledgments
This research project is funded by the National Key Research and Development Program (2016YFE0100600), the National Natural Science Foundation of China (No. 61373032), the National High Technology Research and Development Program of China (863 Program, 2015AA050204), the State Grid Science and Technology Project (520626140020, 14H100000552), State Grid Corporation of China, and the National Research Foundation Singapore under its Campus for Research Excellence and Technological Enterprise (CREATE) programme.
Cite this article
Wang, C., Zhu, Y., Ma, Y. et al. A Query-oriented Adaptive Indexing Technique for Smart Grid Big Data Analytics. J Sign Process Syst 90, 1091–1103 (2018). https://doi.org/10.1007/s11265-017-1269-z
DOI: https://doi.org/10.1007/s11265-017-1269-z