“Duplicate deletion in a ring connected, shared-nothing, parallel database system”

  • Abdelguerfi M. 
  • Grant K. 
  • Murphy E. 
  • Patterson W. 
  • Stelly J. 
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 720)

Abstract

The problem of detecting and removing duplicate tuples within a parallel database has far-reaching implications, and an efficient solution would benefit many areas of computer science. We compare the performance of three algorithms for duplicate detection and removal in a ring-connected, shared-nothing, parallel database system. Each algorithm uses a different pre-processing method to reduce the size of the data set that must be processed. We show that, in this environment, the pre-processing methods we tested achieve too little reduction in run time to offset the added cost of their execution.
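
To make the setting concrete, the following is a minimal sketch of duplicate removal on a simulated ring of shared-nothing nodes. It is not one of the three algorithms compared in the paper, which the abstract does not describe. The function name `ring_dedup`, the tie-breaking rule (the lowest-numbered node keeps a shared tuple), and the sample data are all assumptions made for illustration.

```python
def ring_dedup(partitions):
    """Simulate duplicate removal across a ring of len(partitions) nodes.

    Hypothetical illustration only: each node owns one partition and can
    send data only to its ring successor (shared-nothing).
    """
    p = len(partitions)

    # Step 1: every node removes duplicates inside its own partition.
    local = [set(part) for part in partitions]

    # Step 2: every node puts a snapshot of its fragment on the ring,
    # tagged with the id of the node it originated from.
    in_flight = [(i, frozenset(local[i])) for i in range(p)]

    # Step 3: after p - 1 forwarding rounds each node has seen the
    # fragment of every other node exactly once.
    for _ in range(p - 1):
        # Node j receives whatever its predecessor (j - 1) was holding.
        in_flight = [in_flight[(j - 1) % p] for j in range(p)]
        for j, (origin, fragment) in enumerate(in_flight):
            # Assumed tie-break: the lowest-numbered holder keeps the tuple,
            # so node j drops anything also held by a lower-numbered origin.
            if origin < j:
                local[j] -= fragment

    return local


# Three nodes holding overlapping tuples (made-up sample data).
parts = [
    [(1, "a"), (2, "b"), (1, "a")],
    [(2, "b"), (3, "c")],
    [(3, "c"), (1, "a"), (4, "d")],
]
result = ring_dedup(parts)
```

Under these assumptions each distinct tuple survives on exactly one node. A pre-processing step of the kind the abstract mentions would shrink the fragments before they circulate, and its cost would have to be weighed against the communication it saves.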



Copyright information

© Springer-Verlag Berlin Heidelberg 1993

Authors and Affiliations

  • Abdelguerfi M. (1)
  • Grant K. (1)
  • Murphy E. (1)
  • Patterson W. (1)
  • Stelly J. (2)
  1. Computer Science, University of New Orleans, New Orleans
  2. Mathematics, University of New Orleans, New Orleans
