
A new vertical fragmentation algorithm based on ant collective behavior in distributed database systems

  • Regular Paper
Knowledge and Information Systems

Abstract

Given the massive volumes of data processed today and the distributed nature of many organizations, distributed database systems are clearly essential. In such systems, the response time of a transaction or query depends heavily on the distribution design of the database, particularly its methods for data fragmentation, replication, and allocation. According to the relevant literature, of the two approaches to fragmentation, horizontal and vertical, the latter requires heuristic methods because it is NP-hard. A number of vertical fragmentation methods exist, but they typically incur relatively high computational complexity or fail to yield optimal results, particularly for large-scale problems. In this paper, we apply swarm intelligence algorithms, chosen for their distributed and scalable nature, to present an algorithm that finds a solution to the vertical fragmentation problem that is optimal in most cases. The proposed algorithm fragments relations so as not only to localize transaction processing at each site as much as possible, but also to reduce the cost of operations. Moreover, we report experimental results comparing our algorithm with several similar algorithms, showing that ours outperforms them and generates better solutions in terms of both optimality of results and computational complexity.



Author information

Corresponding author

Correspondence to Mehdi Goli.


About this article

Cite this article

Goli, M., Rouhani Rankoohi, S.M.T. A new vertical fragmentation algorithm based on ant collective behavior in distributed database systems. Knowl Inf Syst 30, 435–455 (2012). https://doi.org/10.1007/s10115-011-0384-6

  • DOI: https://doi.org/10.1007/s10115-011-0384-6
