Accelerating the Neighbor-Joining Algorithm Using the Adaptive Bucket Data Structure

  • Conference paper
Bioinformatics Research and Applications (ISBRA 2008)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4983)

Abstract

The complexity of the neighbor-joining method is determined by the complexity of the search for an optimal pair ("neighbors to join"), which is performed globally at each iteration. Accelerating the neighbor-joining method therefore requires a smarter search for an optimal pair of neighbors, one that avoids re-evaluating all possible pairs of points at each iteration.
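
To make the cost of the global search concrete, it can be sketched as a full scan over all pairs. This is a minimal illustration, not the authors' code. It uses the goal function δ_ij = u_i + u_j − d_ij given in the abstract, and it assumes the standard Studier-Keppler definition u_i = Σ_k d_ik / (n − 2), which the abstract does not spell out. The function name `best_pair_full_scan` and the sample matrix are hypothetical.

```python
from itertools import combinations


def best_pair_full_scan(d):
    """Return (i, j, delta) with the largest delta_ij = u_i + u_j - d_ij.

    Every one of the n*(n-1)/2 pairs is evaluated, which is the work
    the accelerated method tries to avoid.
    """
    n = len(d)
    # Assumption: u_i = sum_k d[i][k] / (n - 2) (Studier-Keppler form).
    u = [sum(row) / (n - 2) for row in d]
    best = None
    for i, j in combinations(range(n), 2):
        delta = u[i] + u[j] - d[i][j]
        if best is None or delta > best[2]:
            best = (i, j, delta)
    return best


# Hypothetical additive distances on five taxa; (0, 1) and (3, 4) are cherries.
d = [
    [0, 5, 8, 8, 9],
    [5, 0, 9, 9, 10],
    [8, 9, 0, 8, 9],
    [8, 9, 8, 0, 3],
    [9, 10, 9, 3, 0],
]
i, j, delta = best_pair_full_scan(d)
print(i, j)  # 3 4
```

Each call examines n(n − 1)/2 pairs, so repeating it at every joining step gives the familiar cubic cost of the unaccelerated method.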

We developed an acceleration technique for the neighbor-joining method that significantly decreases complexity for important applications without any change in the neighbor-joining method itself. The technique uses the bucket data structure. Pairs of nodes are arranged in buckets according to the value of the goal function $\delta_{ij} = u_i + u_j - d_{ij}$, and the buckets are adaptively re-arranged after each neighbor-joining step. Pairs of nodes in the top bucket are re-evaluated at every iteration, whereas pairs in lower buckets are accessed less often: only when the algorithm determines, based on new values of $\delta_{ij}$, that the elements of the bucket need to be re-evaluated. As a result, only a small portion of the candidate pairs of nodes is examined at each iteration.
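
The idea of searching from the top bucket down can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: it uses fixed-width buckets, omits the joining step and the adaptive re-arrangement, and assumes u_i = Σ_k d_ik / (n − 2). The names `compute_u`, `build_buckets`, `best_pair`, and the parameters `width` and `slack` are hypothetical; `slack` stands in for whatever bound the real method uses on how far a stored δ value may have grown since the pair was bucketed.

```python
import math
from itertools import combinations


def compute_u(d):
    # Assumption: u_i = sum_k d[i][k] / (n - 2) (Studier-Keppler form).
    n = len(d)
    return [sum(row) / (n - 2) for row in d]


def build_buckets(d, u, width):
    """Group pairs (i, j) by floor(delta_ij / width); higher key, larger delta."""
    buckets = {}
    for i, j in combinations(range(len(d)), 2):
        delta = u[i] + u[j] - d[i][j]
        buckets.setdefault(math.floor(delta / width), []).append((i, j))
    return buckets


def best_pair(d, u, buckets, width, slack=0.0):
    """Scan buckets from the top and stop when a bucket cannot beat the best.

    A pair stored under `key` had delta < (key + 1) * width when bucketed,
    so its current delta is below that edge plus `slack`.
    Returns the best (i, j, delta) and the number of pairs examined.
    """
    best, examined = None, 0
    for key in sorted(buckets, reverse=True):
        upper = (key + 1) * width + slack
        if best is not None and upper <= best[2]:
            break  # no pair in this or any lower bucket can win
        for i, j in buckets[key]:
            examined += 1
            delta = u[i] + u[j] - d[i][j]  # re-evaluate with current u
            if best is None or delta > best[2]:
                best = (i, j, delta)
    return best, examined


# Hypothetical additive distances on five taxa; (0, 1) and (3, 4) are cherries.
d = [
    [0, 5, 8, 8, 9],
    [5, 0, 9, 9, 10],
    [8, 9, 0, 8, 9],
    [8, 9, 8, 0, 3],
    [9, 10, 9, 3, 0],
]
u = compute_u(d)
buckets = build_buckets(d, u, width=2.0)
best, examined = best_pair(d, u, buckets, width=2.0)
print(best[:2], examined)  # (3, 4) 2
```

With freshly built buckets only the two pairs in the top bucket are examined out of ten. A positive `slack` makes the scan reach into lower buckets, which mirrors how lower buckets are visited only when their contents might have become competitive.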

The algorithm is cache efficient, since the bucket data structures are able to exploit locality and adjust to cache properties.

Editor information

Ion Măndoiu Raj Sunderraman Alexander Zelikovsky

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zaslavsky, L., Tatusova, T.A. (2008). Accelerating the Neighbor-Joining Algorithm Using the Adaptive Bucket Data Structure. In: Măndoiu, I., Sunderraman, R., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2008. Lecture Notes in Computer Science, vol 4983. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79450-9_12

  • DOI: https://doi.org/10.1007/978-3-540-79450-9_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-79449-3

  • Online ISBN: 978-3-540-79450-9

  • eBook Packages: Computer Science, Computer Science (R0)
