“Duplicate deletion in a ring connected, shared-nothing, parallel database system”
The problem of detecting and removing duplicate tuples within a parallel database has far-reaching implications, and an efficient solution would benefit many areas of computer science. We compare the performance of three algorithms for duplicate detection and removal in a ring-connected, shared-nothing, parallel database system. Each algorithm uses a different pre-processing method to reduce the size of the data set that must be processed. We show that, in this environment and for the algorithms tested, pre-processing the database achieves too little reduction in run time to offset the added cost of its execution.
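The abstract does not describe the three algorithms or their pre-processing methods, so the following is only a minimal sketch of what a baseline ring-based duplicate removal without pre-processing might look like. It assumes each node first removes duplicates locally, then passes a snapshot of its fragment around the ring, and that ties are broken by keeping the copy on the lowest-numbered node. The function name `ring_dedup`, the tie-breaking rule, and the single-process simulation of the ring are all assumptions made for illustration, not the authors' method.

```python
def ring_dedup(partitions):
    """Simulate duplicate removal across nodes arranged in a ring.

    partitions: one list of hashable tuples per node (shared-nothing:
    each node only ever touches its own partition and the fragment
    it currently holds).
    Returns one set per node; every distinct tuple survives on exactly
    one node (the lowest-numbered node that originally held it).
    """
    p = len(partitions)

    # Step 1: each node removes duplicates within its own partition.
    local = [set(part) for part in partitions]

    # Each node starts out holding a snapshot of its own fragment,
    # tagged with the id of the node it originated from.
    in_flight = [(i, frozenset(local[i])) for i in range(p)]

    # Step 2: fragments travel p - 1 hops, so every node sees every
    # other node's fragment exactly once.
    for _ in range(p - 1):
        # Node i receives the fragment held by its predecessor.
        in_flight = [in_flight[(i - 1) % p] for i in range(p)]
        for i, (origin, fragment) in enumerate(in_flight):
            # Assumed tie-break: the lower-numbered node keeps the tuple.
            if origin < i:
                local[i] -= fragment
    return local
```

In this sketch every fragment crosses every link of the ring, so the communication volume grows with the size of the locally deduplicated data. That is the cost a pre-processing step would be trying to reduce.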