Exploiting Hierarchies for Efficient Detection of Completeness in Stream Data

  • Simon Razniewski
  • Shazia Sadiq
  • Xiaofang Zhou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9877)

Abstract

In big data settings, data is often externally sourced with little or no knowledge of its quality. In such settings, users need the capacity to understand the quality of data sets and the implications for their use, in order to mitigate the risk of investing in data sets that will not deliver. In this paper we present an approach for detecting the completeness of high-volume stream data generated by a large number of data providers. By exploiting the inherent hierarchies within database attributes, we devise an efficient solution for computing query-specific completeness, thereby improving users' understanding of the implications of using query results based on incomplete data.

Keywords

Logic Programming, Query Result, Completeness Statement, Descriptor Pattern, Efficient Promotion
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Free University of Bozen-Bolzano, Bolzano, Italy
  2. School of ITEE, The University of Queensland, Brisbane, Australia
