## Abstract

We consider the NP-hard Tree Containment problem that has important applications in phylogenetics. The problem asks if a given single-rooted leaf-labeled network (“phylogenetic network”) *N* “contains” a given leaf-labeled tree (“phylogenetic tree”) *T*. We develop a fast algorithm for the case that *N* is a phylogenetic tree in which multiple leaves might share a label. Generalizing a previously known decomposition scheme lets us leverage this algorithm, yielding linear-time algorithms for so-called “reticulation visible” networks and “nearly stable” networks. While these are special classes of networks, they rank among the most general of the previously considered cases. We also present a dynamic programming algorithm that solves the general problem in \(O(3^{t^*}\cdot |N|\cdot |T|)\) time, where the parameter \(t^*\) is the maximum number of “tree components with unstable roots” in any block of the input network. Notably, \(t^*\) is stronger (that is, smaller on all networks) than the previously considered parameter “number of reticulations” and even the popular parameter “level” of the input network.
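To make the notion of “containment” concrete, the following is a minimal brute-force sketch. It is not the algorithm developed in this chapter. It assumes that *N* and *T* are given as lists of arcs, that leaves are uniquely labeled by their vertex names, and that both have the same label set. The function names `displays`, `canon` and `root_and_leaves` are illustrative only. The sketch tries every way of keeping exactly one incoming arc per reticulation, which takes time exponential in the number of reticulations.

```python
from collections import defaultdict
from itertools import product


def root_and_leaves(arcs):
    """Return the unique root (in-degree 0) and the set of leaves (out-degree 0)."""
    tails = {u for u, _ in arcs}
    heads = {v for _, v in arcs}
    (root,) = tails - heads  # assumption: exactly one root
    return root, heads - tails


def canon(arcs, leaves, root):
    """Canonical form of a tree: drop dead ends, suppress vertices with one child."""
    kids = defaultdict(list)
    for u, v in arcs:
        kids[u].append(v)

    def rec(v):
        if v in leaves:
            return v  # a leaf is represented by its label
        forms = [f for f in (rec(c) for c in kids[v]) if f is not None]
        if not forms:
            return None  # internal vertex that lost all of its children
        return forms[0] if len(forms) == 1 else frozenset(forms)

    return rec(root)


def displays(net_arcs, tree_arcs):
    """Brute-force check whether the network contains the tree."""
    n_root, n_leaves = root_and_leaves(net_arcs)
    t_root, t_leaves = root_and_leaves(tree_arcs)
    target = canon(tree_arcs, t_leaves, t_root)

    parents = defaultdict(list)
    for u, v in net_arcs:
        parents[v].append(u)
    retics = [v for v, ps in parents.items() if len(ps) >= 2]
    tree_part = [(u, v) for u, v in net_arcs if len(parents[v]) == 1]

    # Keep exactly one incoming arc for every reticulation.
    for choice in product(*(parents[r] for r in retics)):
        kept = tree_part + list(zip(choice, retics))
        if canon(kept, n_leaves, n_root) == target:
            return True
    return False


# A network with a single reticulation h above leaf a.
net = [("r", "x"), ("r", "y"), ("x", "h"), ("y", "h"),
       ("h", "a"), ("x", "b"), ("y", "c")]
print(displays(net, [("t", "u"), ("t", "c"), ("u", "a"), ("u", "b")]))  # True
print(displays(net, [("t", "u"), ("t", "a"), ("u", "b"), ("u", "c")]))  # False
```

In the example, the tree ((a,b),c) is found by keeping the arc from `x` to `h`, whereas no choice of arcs yields ((b,c),a).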

## Notes

1. Herein, is the maximum out-degree in *T* and is the maximum in-degree in the result of contracting all arcs between reticulations in *N*.
2.
3. A biconnected component (or “block”) of a network is a subdigraph induced by the vertices of a biconnected component of its underlying undirected graph, that is, a connected component in the result of removing all bridges.
4. The *level* of a phylogenetic network is the largest number of reticulations in any biconnected component (of its underlying undirected graph).
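
The definitions of “block” and “level” in the notes above can be turned into a short computation. The following is a hedged sketch: it follows the bridge-removal characterization of blocks given in the notes. The name `network_level` and the encoding of the network as a list of arcs are assumptions made for illustration only. The depth-first search is recursive, so it is suited to small inputs.

```python
from collections import defaultdict


def network_level(arcs):
    """Largest number of reticulations in any block of the network."""
    adj = defaultdict(list)   # underlying undirected graph, edges tagged by arc index
    indeg = defaultdict(int)
    for i, (u, v) in enumerate(arcs):
        adj[u].append((v, i))
        adj[v].append((u, i))
        indeg[v] += 1

    disc, low, bridges = {}, {}, set()

    def dfs(u, parent_edge):
        # Standard low-link search: edge i is a bridge if nothing below it
        # reaches back to u or above.
        disc[u] = low[u] = len(disc)
        for w, i in adj[u]:
            if i == parent_edge:
                continue
            if w in disc:
                low[u] = min(low[u], disc[w])
            else:
                dfs(w, i)
                low[u] = min(low[u], low[w])
                if low[w] > disc[u]:
                    bridges.add(i)

    for s in list(adj):
        if s not in disc:
            dfs(s, None)

    # Blocks are the connected components that remain after removing all bridges.
    seen, level = set(), 0
    for s in list(adj):
        if s in seen:
            continue
        seen.add(s)
        stack, count = [s], 0
        while stack:
            u = stack.pop()
            count += indeg[u] >= 2   # u is a reticulation
            for w, i in adj[u]:
                if i not in bridges and w not in seen:
                    seen.add(w)
                    stack.append(w)
        level = max(level, count)
    return level


# Two reticulations h1 and h2 that lie in the same block.
arcs = [("r", "x"), ("r", "y"), ("x", "h1"), ("y", "h1"),
        ("x", "h2"), ("y", "h2"), ("h1", "a"), ("h2", "b")]
print(network_level(arcs))  # 2
```

A tree has no reticulations and therefore level 0. A network whose reticulations are spread over several blocks can have a level much smaller than its total number of reticulations.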

## Acknowledgement

Thanks to Celine Scornavacca for her thorough proof-reading.

## Appendix

### Proof

*(Proof of correctness of Rule 2).* Let \(S^v\) be a subdivision of \(T_v\) in *P* and let \((N',T')\) be the result of applying Rule 2 to (*N*, *T*).

“\(\Leftarrow \)”: Let \(N'\) contain a subdivision \(S'\) of \(T'\). It suffices to show that the result *S* of replacing \(\rho \left( P\right) \) with \(S^v\) in \(S'\) is contained in *N*, since *S* is clearly a subdivision of *T*. Since \(S^v\) is contained in *P*, it suffices to show that \(S'\) and \(S^v\) are vertex disjoint (except for \(\rho \left( P\right) \)). Towards a contradiction, assume that \(S'\) and \(S^v\) both contain a vertex \(u\ne \rho \left( P\right) \) of *P*. Since \(\mathcal {L}(S')\) and \(\mathcal {L}(S^v)\) are disjoint, *u* is an ancestor of at least two different leaves in *N*. Thus, *u* is in the tip of *P*, contradicting that *u* is in \(N'\).

“\(\Rightarrow \)”: Let *N* contain a subdivision *S* of *T* and let . Since \(\rho \left( P\right) \) is stable on *c* and \(c\in \mathcal {L}(T_v)\), we have \(u\le _N \rho \left( P\right) \), implying \(\mathcal {L}(S_{\rho \left( P\right) })\supseteq \mathcal {L}(T_v)\). Further, maximality of *v* implies \(\mathcal {L}(S_{\rho \left( P\right) })\subseteq \mathcal {L}(T_v)\). Let \(S'\) result from *S* by contracting \(S_{\rho \left( P\right) }\) into a single vertex and labeling this vertex \(\lambda \). Since \(\mathcal {L}(S_{\rho \left( P\right) })=\mathcal {L}(T_v)\), we know that \(S'\) is a subdivision of \(T'\) and it suffices to show that \(N'\) contains \(S'\). To do this, we show that all vertices of \(S'\) are in \(N'\). Assume towards a contradiction that \(S'\) contains a vertex *w* that is not in \(N'\). Then, *w* is in the tip of *P*, implying \(\mathcal {L}(S_w)\subseteq \mathcal {L}(S_{\rho \left( P\right) })\). Thus, *w* is a vertex of \(S_{\rho \left( P\right) }\) contradicting *w* being in \(S'\).

## Copyright information

© 2018 Springer Nature Switzerland AG

## About this paper

### Cite this paper

Weller, M. (2018). Linear-Time Tree Containment in Phylogenetic Networks. In: Blanchette, M., Ouangraoua, A. (eds) Comparative Genomics. RECOMB-CG 2018. Lecture Notes in Computer Science, vol 11183. Springer, Cham. https://doi.org/10.1007/978-3-030-00834-5_18

DOI: https://doi.org/10.1007/978-3-030-00834-5_18

Publisher Name: Springer, Cham

Print ISBN: 978-3-030-00833-8

Online ISBN: 978-3-030-00834-5

eBook Packages: Computer Science, Computer Science (R0)