Repair Position Selection for Inconsistent Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10627)

Abstract

Inconsistent data contains conflicting information, which can be formalized as violations of given semantic constraints. To improve data quality, a repair makes the data consistent by modifying the original data. Using user feedback to direct the repair operations is a popular solution. In the big data setting, however, it is unrealistic to ask users for feedback on the whole data set. In this paper, the repair position selection problem (RPS for short) is formally defined and studied. Intuitively, the RPS problem asks for an optimal set of repair positions, subject to a limit on repair cost, such that as much of the data as possible becomes consistent. First, the RPS problem is formalized. Then, for three different repair strategies, the complexity and approximability of the corresponding RPS problems are studied.
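
To make the intuition concrete, the following is a minimal toy sketch, not the paper's formalization. It assumes a single functional dependency (zip code determines city) as the semantic constraint, a unit cost per repaired row, and that repairing a row removes every violation that row takes part in. The relation, the function names, and the brute-force search are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy relation: each row is (zip_code, city).
rows = [
    ("10001", "New York"),
    ("10001", "NYC"),
    ("10001", "New York"),
    ("60601", "Chicago"),
    ("60601", "Chicgo"),
    ("94105", "San Francisco"),
]

def fd_violations(rows, lhs=0, rhs=1):
    """Return pairs of row indices that violate the dependency lhs -> rhs."""
    return [
        (i, j)
        for i, j in combinations(range(len(rows)), 2)
        if rows[i][lhs] == rows[j][lhs] and rows[i][rhs] != rows[j][rhs]
    ]

def consistent_count(n_rows, violations, repaired):
    """Count rows that take part in no violation once `repaired` rows are fixed.

    Assumption: repairing a row removes every violation it is involved in.
    """
    remaining = [
        (i, j) for i, j in violations
        if i not in repaired and j not in repaired
    ]
    dirty = {idx for pair in remaining for idx in pair}
    return n_rows - len(dirty)

def select_positions(rows, budget):
    """Brute-force the best set of at most `budget` rows to repair.

    Exhaustive search is exponential in the budget; it is used here only
    because the toy instance is tiny.
    """
    violations = fd_violations(rows)
    candidates = sorted({idx for pair in violations for idx in pair})
    best = (consistent_count(len(rows), violations, set()), ())
    for k in range(1, budget + 1):
        for subset in combinations(candidates, k):
            score = consistent_count(len(rows), violations, set(subset))
            if score > best[0]:
                best = (score, subset)
    return best

print(fd_violations(rows))        # conflicting row pairs
print(select_positions(rows, 1))  # best single position to repair
print(select_positions(rows, 2))  # best pair of positions to repair
```

In this toy instance, row 1 conflicts with two other rows, so with a budget of one it is the most valuable position to repair. The exhaustive search would not scale to real data sets, which is consistent with the abstract's focus on complexity and approximability.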

This work was supported in part by the General Program of the National Natural Science Foundation of China under grants 61502121, 61402130, 61772157, U1509216, the China Postdoctoral Science Foundation under grant 2016M590284, the Fundamental Research Funds for the Central Universities (Grant No. HIT.NSRIF.201649), and Heilongjiang Postdoctoral Foundation (Grant No. LBH-Z15094).

Author information

Corresponding author

Correspondence to Xianmin Liu.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Liu, X., Li, Y., Li, J. (2017). Repair Position Selection for Inconsistent Data. In: Gao, X., Du, H., Han, M. (eds) Combinatorial Optimization and Applications. COCOA 2017. Lecture Notes in Computer Science, vol 10627. Springer, Cham. https://doi.org/10.1007/978-3-319-71150-8_35

  • DOI: https://doi.org/10.1007/978-3-319-71150-8_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-71149-2

  • Online ISBN: 978-3-319-71150-8

  • eBook Packages: Computer Science, Computer Science (R0)
