Abstract
In this paper we address the solution of very large instances of the design distribution problem for distributed databases. Traditionally, the capacity for solving large-scale instances of NP-hard problems has been limited by the available computing resources and the efficiency of the solution algorithms. In contrast, we present a new solution approach that makes it possible to solve larger instances using the same resources. The approach applies a systematic method for transforming an instance A into a smaller instance A’ that is highly representative of A. To validate the approach, we used a mathematical model that we developed, whose solution yields the design of a distributed database that minimizes communication costs. In the tests, the solution quality obtained with the transformed instances was on average 10.51% worse than the optimal solution, while the average size reduction was 97.81%. We believe that the principles used in the proposed approach can be applied to very large instances of other NP-hard problems.
This research was supported in part by CONACYT and COSNET.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pérez O., J. et al. (2005). An Approach for Solving Very Large Scale Instances of the Design Distribution Problem for Distributed Database Systems. In: Ramos, F.F., Larios Rosillo, V., Unger, H. (eds) Advanced Distributed Systems. ISSADS 2005. Lecture Notes in Computer Science, vol 3563. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11533962_4
DOI: https://doi.org/10.1007/11533962_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28063-7
Online ISBN: 978-3-540-31674-9
eBook Packages: Computer Science (R0)