Abstract
Networks of thousands of sensors present a feasible and economical solution to some of our most challenging problems, such as real-time traffic modeling and military sensing and tracking. Many research projects on wireless sensor networks have been conducted by different organizations; however, few of them discuss how to estimate missing sensor data. In this research we present a novel data estimation technique based on association rules derived from closed frequent itemsets in the data generated by sensors. Experimental comparisons with existing techniques on real-life sensor data show that closed itemset mining imputes missing values effectively while also being efficient in time and space.
This research is partially supported by NASA grant No. NNG05GA30G and a research grant from the United States Department of Defense.
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiang, N., Gruenwald, L. (2007). Estimating Missing Data in Data Streams. In: Kotagiri, R., Krishna, P.R., Mohania, M., Nantajeewarawat, E. (eds) Advances in Databases: Concepts, Systems and Applications. DASFAA 2007. Lecture Notes in Computer Science, vol 4443. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71703-4_89
DOI: https://doi.org/10.1007/978-3-540-71703-4_89
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71702-7
Online ISBN: 978-3-540-71703-4
eBook Packages: Computer Science (R0)