Estimating Missing Data in Data Streams

  • Conference paper
Advances in Databases: Concepts, Systems and Applications (DASFAA 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4443)

Abstract

Networks of thousands of sensors offer a feasible and economical solution to some of our most challenging problems, such as real-time traffic modeling and military sensing and tracking. Many organizations have conducted research projects on wireless sensor networks; however, few of them discuss how to estimate missing sensor data. In this research we present a novel data estimation technique based on association rules derived from closed frequent itemsets generated by sensors. Experimental comparisons with existing techniques on real-life sensor data show that closed itemset mining imputes missing values effectively while also achieving time and space efficiency.

This research is partially supported by the NASA grant No. NNG05GA30G and a research grant from the United States Department of Defense.
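
The idea summarized in the abstract, estimating a missing reading from association rules built on closed frequent itemsets, can be illustrated with a small sketch. This is not the authors' algorithm or their stream-mining procedure: it assumes a tiny fixed window of discretized readings, mines closed itemsets by brute force, and picks the rule with the highest confidence. The names `support`, `closed_frequent_itemsets`, `estimate_missing`, the sensor labels, and the sample window are all hypothetical.

```python
from itertools import combinations


def support(itemset, window):
    """Count the rounds in the window that contain every item of itemset."""
    return sum(1 for round_ in window if itemset <= round_)


def closed_frequent_itemsets(window, min_support):
    """Brute-force mining, suitable only for tiny illustrative windows."""
    items = sorted(set().union(*window))
    frequent = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = support(candidate, window)
            if count >= min_support:
                frequent[candidate] = count
    # An itemset is closed if no proper superset has the same support.
    return {
        itemset: count
        for itemset, count in frequent.items()
        if not any(itemset < other and count == other_count
                   for other, other_count in frequent.items())
    }


def estimate_missing(sensor, observed, window, min_support=2):
    """Return (value, confidence) from the best rule: observed -> sensor value."""
    best_value, best_conf = None, 0.0
    closed = closed_frequent_itemsets(window, min_support)
    for itemset, count in closed.items():
        targets = [item for item in itemset if item[0] == sensor]
        if len(targets) != 1:
            continue
        antecedent = itemset - {targets[0]}
        # The rule applies only if its antecedent was actually observed.
        if not antecedent or not antecedent <= observed:
            continue
        confidence = count / support(antecedent, window)
        if confidence > best_conf:
            best_value, best_conf = targets[0][1], confidence
    return best_value, best_conf


# Hypothetical window: each round is a set of (sensor, discretized state) items.
window = [
    frozenset({("s1", "high"), ("s2", "high"), ("s3", "low")}),
    frozenset({("s1", "high"), ("s2", "high"), ("s3", "low")}),
    frozenset({("s1", "high"), ("s2", "high"), ("s3", "high")}),
    frozenset({("s1", "low"), ("s2", "low"), ("s3", "low")}),
    frozenset({("s1", "low"), ("s2", "low"), ("s3", "high")}),
]

# Sensor s2 failed to report in the current round; s1 and s3 did report.
observed = frozenset({("s1", "high"), ("s3", "low")})
print(estimate_missing("s2", observed, window))  # ('high', 1.0)
```

In this toy window, s2 reads "high" in every round where s1 reads "high", so the rule s1=high -> s2=high has confidence 1.0 and supplies the estimate. A streaming setting would maintain the closed itemsets incrementally over a sliding window rather than recomputing them for each query.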

Editor information

Ramamohanarao Kotagiri, P. Radha Krishna, Mukesh Mohania, Ekawit Nantajeewarawat

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jiang, N., Gruenwald, L. (2007). Estimating Missing Data in Data Streams. In: Kotagiri, R., Krishna, P.R., Mohania, M., Nantajeewarawat, E. (eds) Advances in Databases: Concepts, Systems and Applications. DASFAA 2007. Lecture Notes in Computer Science, vol 4443. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71703-4_89

  • DOI: https://doi.org/10.1007/978-3-540-71703-4_89

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71702-7

  • Online ISBN: 978-3-540-71703-4

  • eBook Packages: Computer Science, Computer Science (R0)