An Approach to Identify n-wMVD for Eliminating Data Redundancy

  • Sangeeta Viswanadham
  • Vatsavayi Valli Kumari
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 182)

Abstract

Data cleaning is the process of determining whether two or more records that are represented differently in a database refer to the same real-world object. It is a vital step in data warehouse preprocessing. Duplication, or redundancy, is encountered frequently when large amounts of data collected from different sources are loaded into the warehouse. Eliminating redundancy in the data warehouse removes conflicting records that can otherwise lead to wrong decisions, and it also reduces wasted storage space. One way of eliminating redundancy is to retrieve similar records using tokens formed on prominent attributes. Another approach is to use Conditional Functional Dependencies (CFDs) to capture the consistency of data by combining semantically related data. Existing work on data cleaning does not deal with multi-valued attributes. This paper deals with nesting based weak multi-valued dependencies (n-wMVD), which can handle multi-valued attributes and remove redundancy. Our contributions are twofold: (i) an approach to convert a given database to wMVD, and (ii) an implementation of n-wMVD to eliminate redundancy. The applicability of our approach was tested; the results are encouraging and are presented in the paper.
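
To make the idea of nesting concrete, the sketch below groups a flat relation on a set of key attributes and collects one multi-valued attribute into a set. It is only an illustration of a generic nest operation under assumed data, not the authors' n-wMVD algorithm; the function name `nest`, the attribute names, and the sample rows are all hypothetical.

```python
from collections import defaultdict


def nest(rows, key_attrs, nested_attr):
    """Group rows that agree on key_attrs and collect the values of
    nested_attr into a set, so repeated values are stored once.

    Illustrative only: a generic nest operation, not the paper's method.
    """
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[a] for a in key_attrs)
        groups[key].add(row[nested_attr])
    # Rebuild one record per key, with the nested attribute as a frozenset.
    return [
        {**dict(zip(key_attrs, key)), nested_attr: frozenset(values)}
        for key, values in groups.items()
    ]


# Hypothetical flat relation with a multi-valued attribute "phone".
flat = [
    {"name": "Asha", "city": "Visakhapatnam", "phone": "111"},
    {"name": "Asha", "city": "Visakhapatnam", "phone": "222"},
    {"name": "Asha", "city": "Visakhapatnam", "phone": "111"},  # exact duplicate
    {"name": "Ravi", "city": "Hyderabad", "phone": "333"},
]

nested = nest(flat, ["name", "city"], "phone")
```

In this assumed example the four flat rows collapse into two nested records, and the repeated key values and the duplicate phone number are each stored once.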

Keywords

Conditional Functional Dependencies (CFD); weak Multi-valued Dependencies (wMVD); nesting based weak Multi-valued Dependencies (n-wMVD)

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Pydah College of Engg & Tech, Visakhapatnam, India
  2. Andhra University, Visakhapatnam, India
