An Approach for Solving Very Large Scale Instances of the Design Distribution Problem for Distributed Database Systems

  • Joaquín Pérez O.
  • Rodolfo A. Pazos R.
  • Juan Frausto-Solís
  • Gerardo Reyes S.
  • Rene Santaolaya S.
  • Héctor J. Fraire H.
  • Laura Cruz R.
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3563)

Abstract

In this paper we address the solution of very large instances of the design distribution problem for distributed databases. Traditionally, the capacity for solving large-scale instances of NP-hard problems has been limited by the available computing resources and the efficiency of the solution algorithms. In contrast, we present a new solution approach that makes it possible to solve larger instances using the same resources. The approach applies a systematic method for transforming an instance A into a smaller instance A’ that is highly representative of instance A. To validate the approach, we used a mathematical model that we developed, whose solution yields a distributed database design that minimizes communication costs. The tests showed that the solution quality of the transformed instances was on average 10.51% worse than the optimal solution; however, the size reduction averaged 97.81%. We consider that the principles used in the proposed approach can be applied to very large instances of other types of NP-hard problems.

Keywords

Large Instance · Geometric Program · Original Instance · Large Scale Instance · Distribute Database System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Joaquín Pérez O. (1)
  • Rodolfo A. Pazos R. (1)
  • Juan Frausto-Solís (2)
  • Gerardo Reyes S. (1)
  • Rene Santaolaya S. (1)
  • Héctor J. Fraire H. (3)
  • Laura Cruz R. (3)

  1. Centro Nacional de Investigación y Desarrollo Tecnológico (CENIDET), Cuernavaca, México
  2. ITESM, Temixco, México
  3. Instituto Tecnológico de Ciudad Madero, México
