Abstract
The complexity of the neighbor-joining method is determined by the complexity of the search for an optimal pair ("neighbors to join"), which is performed globally at each iteration. Accelerating the method therefore requires a smarter search for an optimal pair of neighbors, one that avoids re-evaluating all possible pairs of points at each iteration.
We developed an acceleration technique for the neighbor-joining method that significantly decreases complexity for important applications without changing the neighbor-joining method itself. The technique uses the bucket data structure. Pairs of nodes are arranged in buckets according to the value of the goal function $\delta_{ij} = u_i + u_j - d_{ij}$. Buckets are adaptively rearranged after each neighbor-joining step. While the pairs of nodes in the top bucket are re-evaluated at every iteration, pairs in lower buckets are accessed less often, only when the algorithm determines that the elements of a bucket need to be re-evaluated based on new values of $\delta_{ij}$. As a result, only a small portion of the candidate pairs of nodes is examined at each iteration.
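For reference, the global search that the bucket structure is meant to avoid can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the common Studier-Keppler form $u_i = \sum_k d_{ik} / (n - 2)$, which the abstract does not spell out, and the names `best_pair_global`, `d`, and `active` are hypothetical.

```python
from itertools import combinations


def best_pair_global(d, active):
    """Scan every pair of active nodes and return the one maximizing
    delta_ij = u_i + u_j - d_ij.

    d      -- symmetric distance matrix (list of lists)
    active -- collection of node indices still to be joined (at least 3)
    """
    nodes = sorted(active)
    n = len(nodes)
    # Assumed definition: u_i is the row sum of distances divided by (n - 2).
    u = {i: sum(d[i][k] for k in nodes if k != i) / (n - 2) for i in nodes}
    # Every one of the n * (n - 1) / 2 pairs is evaluated at each call.
    return max(
        combinations(nodes, 2),
        key=lambda p: u[p[0]] + u[p[1]] - d[p[0]][p[1]],
    )
```

Calling this once per iteration costs quadratic work per join, which is the cost the abstract attributes to the global search.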
The algorithm is cache efficient, since the bucket data structures are able to exploit locality and adjust to cache properties.
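The bucketing idea described above can be illustrated with a small sketch. It only shows the static part: pairs are placed into fixed-width buckets by their goal-function value, and the best pair is then found by scanning the top bucket alone. The adaptive rearrangement after each join, and the rule for deciding when lower buckets must be re-evaluated, are not reproduced here. The fixed bucket `width`, the assumed definition $u_i = \sum_k d_{ik} / (n - 2)$, and all function names are assumptions made for illustration.

```python
import math
from collections import defaultdict
from itertools import combinations


def build_buckets(d, active, width):
    """Group all pairs of active nodes into buckets keyed by
    floor(delta_ij / width), where delta_ij = u_i + u_j - d_ij."""
    nodes = sorted(active)
    n = len(nodes)
    # Assumed definition: u_i is the row sum of distances divided by (n - 2).
    u = {i: sum(d[i][k] for k in nodes if k != i) / (n - 2) for i in nodes}
    buckets = defaultdict(list)
    for i, j in combinations(nodes, 2):
        delta = u[i] + u[j] - d[i][j]
        buckets[math.floor(delta / width)].append((delta, i, j))
    return buckets


def best_from_top_bucket(buckets):
    """Return the best pair and the number of pairs actually examined.

    The bucket index is monotone in delta, so the maximum delta always
    lies in the highest non-empty bucket.
    """
    top = max(buckets)
    delta, i, j = max(buckets[top])
    return (i, j), len(buckets[top])
```

With a suitable width, the top bucket holds far fewer pairs than the full candidate set, so the scan touches only a fraction of the pairs. Keeping the buckets valid as the values change between iterations is the part the paper's adaptive scheme addresses.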
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zaslavsky, L., Tatusova, T.A. (2008). Accelerating the Neighbor-Joining Algorithm Using the Adaptive Bucket Data Structure. In: Măndoiu, I., Sunderraman, R., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2008. Lecture Notes in Computer Science, vol 4983. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79450-9_12
DOI: https://doi.org/10.1007/978-3-540-79450-9_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-79449-3
Online ISBN: 978-3-540-79450-9
eBook Packages: Computer Science (R0)