Abstract
Cyber-physical systems interconnect the cyber world with the physical world through massively networked sensors that monitor the physical world. A wide range of services is expected to apply information technology to sensor data that reflect the physical world. Given this expectation, it is important to provide timely access to massive data while also reducing storage costs. We propose a data storage scheme for storing and querying massive sensor data. The scheme is scalable because it adopts a distributed architecture, is fault-tolerant even without costly data replication, and enables users to efficiently select multi-scale random data samples for statistical analysis. We implemented a prototype system based on our scheme and evaluated its sampling performance. The results show that the prototype system exhibits lower latency than a conventional distributed storage system.
Keywords
- data accuracy
- random sampling
- relaxed durability
Copyright information
© 2012 IFIP International Federation for Information Processing
About this paper
Cite this paper
Sato, H., Kurasawa, H., Inoue, T., Nakamura, M., Matsumura, H., Koyanagi, K. (2012). Distributed Sampling Storage for Statistical Analysis of Massive Sensor Data. In: Quirchmayr, G., Basl, J., You, I., Xu, L., Weippl, E. (eds) Multidisciplinary Research and Practice for Information Systems. CD-ARES 2012. Lecture Notes in Computer Science, vol 7465. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32498-7_18
DOI: https://doi.org/10.1007/978-3-642-32498-7_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32497-0
Online ISBN: 978-3-642-32498-7
eBook Packages: Computer Science (R0)
